DECLARATION OF DR. JERRY B. DODGSON PURSUANT TO 37 CFR 1.132



I, Dr. Jerry B. Dodgson, hereby declare as follows: 

1. I am a skilled practitioner in the field of poultry molecular biology. I 
currently hold the position of Professor at Michigan State University in Lansing, 
Michigan. My professional experience, educational background and list of publications 
are detailed in attached Exhibit A. 

2. I have reviewed the above-referenced patent application (hereinafter the
"Application") and I have knowledge of the invention described and claimed in the
Application.

3. I understand that the Examiner has indicated that the Application does not 
enable making any transgenic avian because the Examiner believes that undue 
experimentation would be required to produce transgenic avians other than chickens in 
accordance with the invention. 

4. Based on my knowledge in the field, though the invention may not 
necessarily work for all avians, I believe that the claimed method could be adapted by a 
person of skill in the art to be useful to produce transgenic avians other than chickens. 
For example, I would anticipate a likelihood of success with the claimed method for other 
domesticated birds which are of reasonable size to perform the surgical procedure. Of 
course, there are certain birds that may not breed in captivity and being able to breed 
birds in captivity is a requisite for the claimed invention. In addition, a bird such as a 
hummingbird might be too small on which to perform the surgical procedure. 
Nevertheless, I believe a person skilled in the art, who has read the Application, would be 
able to produce transgenic chickens and transgenic avians other than chickens by 
microinjecting into a cell of the avian embryo a nucleic acid that includes a transgene and
introducing the microinjected avian embryo into an oviduct of a recipient hen, such that
the recipient hen lays a shelled egg containing the microinjected avian embryo. For 
example, though there are an estimated 9600 avian species, the basic structure of the 
embryo among avians is essentially the same. That is, the unfertilized egg contains an 
ovum or germinal disc that rests atop a yolk. In addition, the female reproductive tract 
among avians is essentially the same. In particular, the ovum is fertilized as the yolk 
passes through the infundibulum of the reproductive tract of the avian and the yolk 
continues on through the magnum where it becomes coated with egg white. In the 
second paragraph at page 22 of the Application it is stated that fertilized ova, and 
preferably stage I embryos, are isolated from euthanized hens between forty-five minutes 
and four hours after oviposition of the previous egg. Though the optimum times may 
vary amongst avians for isolating a stage I embryo, a skilled practitioner in the art would 
be able to follow these directions to obtain embryos, such as stage I embryos, from avians
other than chickens which can be injected in accordance with the invention. The
experimentation required to determine when to isolate an embryo such as a stage I 
embryo from a certain avian genus for use in accordance with the Application would be 
routine. The introduction of the injected embryo into a recipient female avian (hen) 
could be accomplished by a practitioner of ordinary skill in the art after which a hard 
shell egg would be laid by the avian from which a transgenic chick would hatch. 

5. Certain nucleic acids that have been described in the Application as useful 
for producing transgenic avians by microinjection in accordance with the invention are 
not nucleic acids which would be preferential for producing transgenic chickens over 
other transgenic avians. Example 6 at page 75 describes pAVIJCR-A115.93.1.2, a
plasmid vector that is based on the Bluescript vector. Example 1 at page 71 describes the
use of pAVIJCR-A115.93.1.2 to make a transgenic avian. It is very likely that
pAVIJCR-A115.93.1.2 itself with no modifications would work in avians other than
chickens for the production of interferon since magnum specific promoters (e.g., chicken 
ovalbumin, ovomucoid and lysozyme promoters) would likely function in the magnum of 
avians other than chickens. In any event, by reading the Application, a practitioner of 
ordinary skill in the art would be able to make a plasmid, or other DNA construct, useful 
for producing transgenic avians other than chickens by microinjection. For example, a 
constitutive promoter such as a CMV promoter which is described at page 13, second full 
paragraph of the Application could be used. CMV will function in the chicken as can be 
seen in several of the Examples 10 to 24 of the Application. CMV is not a chicken 
specific promoter and as such, since the promoter functions in chickens, I would expect 
the CMV promoter would function in the oviduct of most or all avians. Therefore, I 
believe that production of transgenic avians in accordance with the claimed invention is 
not limited to chickens and would include birds such as turkeys, pheasants, quails, ducks,
ostriches and other commonly bred poultry. 

6. I understand the Examiner has made certain rejections based on lack of 
enablement with regard to promoters which are claimed for use in the invention.

7. I have read the definition for "magnum specific" promoter at page 13,
lines 10 to 12 of the Application where it is stated that "A 'magnum specific' promoter,
as used herein, is a promoter that is primarily or exclusively active in the tubular gland 
cells of the avian magnum." It is well known in the art that avian lysozyme promoters, 
ovalbumin promoters, ovomucoid promoters and ovomucin promoters are primarily or 
exclusively active in the tubular gland cells of the avian magnum and as such are 
magnum specific promoters within the stated meaning. In addition, magnum specific 
promoters such as avian lysozyme promoters, ovalbumin promoters, ovomucoid 
promoters and ovomucin promoters had been characterized and were known in the art to 
be "magnum specific" promoters prior to the Application filing date of September 18, 
2002. Promoters such as CMV and RSV are constitutive promoters and are not primarily 
or exclusively active in the tubular gland cells of the avian magnum and as such are not 
magnum specific promoters. 

8. I understand that the Examiner in the case has rejected claims as being 
indefinite with regard to certain terms present in the claims including the terms 
"optimized for codon usage", "structural polypeptide", "thioredoxin" and "polyhistidine". 

9. A practitioner of ordinary skill in the art would know that the phrase
"optimized for codon usage" refers to a coding sequence for which the codons for a
certain protein have been selected based on the preferential codon use during protein 
translation by the organism in which the coding sequence is being expressed. 

10. A practitioner of ordinary skill in the art understands that the phrase 
"structural polypeptide" as used in originally filed claim 23 refers to a polypeptide 
produced by an organism where the protein is part of a structural component of the 
organism, as opposed to soluble polypeptides, for example, enzymes, which are typically in
the cytoplasm or other non-structural cellular compartment and are not part of a structural
component of the organism.

11. A practitioner of ordinary skill in the art understands that thioredoxin and
polyhistidine are each a separate example of a peptide or protein useful for isolating a
heterologously expressed protein. A practitioner of ordinary skill in the art understands 
how thioredoxin and polyhistidine can be used to isolate heterologously expressed 
proteins. A practitioner of ordinary skill in the art also understands that thioredoxin is not 
a polyhistidine. Thioredoxin is typically a protein of 108 amino acids which contains an 
active center and disulfide linkages. Polyhistidine is a simple peptide consisting of 
histidine residues, for example, six to eight histidines in length. 

12. I hereby further declare that all statements made herein of my own 
knowledge are true and that all statements made on information and belief are believed to 
be true and further that these statements were made with the knowledge that willful false 
statements and the like so made are punishable by fine or imprisonment or both under 
section 1001 of title 18 of the United States Code and that such willful false statements 
may jeopardize the validity of the application or any patent issuing thereon.

EXHIBIT A



JERRY B. DODGSON 

Department of Microbiology & Molecular Genetics, Michigan State University 
2209 Biomedical & Physical Sciences Building 
East Lansing, Michigan 48824-4320 

Education 

Undergraduate: Michigan State University (1965-1969)
1969 B.S., Major in Chemistry

Graduate: University of Wisconsin (1969-1970 and 1972-1976)
1976 Ph.D., Major in Biochemistry

Postdoctoral: California Institute of Technology (1976-1979)
Dept. of Chemistry

Research and Teaching Experience 

1970-1972 U.S. Army; Biological research technician (enlisted personnel); U.S. Army Research
Institute for Environmental Medicine, Natick, MA; Wayne Evans, group supervisor.
1976-1979 Postdoctoral Research Associate; Department of Chemistry, California Institute of
Technology, Pasadena, CA; Norman Davidson, postdoctoral mentor.
1979-1983 Assistant Professor, Departments of Microbiology and Biochemistry, Michigan State
University, East Lansing, MI
1983-1988 Associate Professor, Departments of Microbiology and Biochemistry, Michigan State
University, East Lansing, MI
1987-1998 Director, Michigan State U. Biotechnology Research Center
1988-2003 Professor, Departments of Microbiology and Biochemistry, Michigan State
University, East Lansing, MI
2003-present Professor, Department of Microbiology & Molecular Genetics, Michigan State
University, East Lansing, MI 48824.
1989-1990 Acting Chairperson, Department of Microbiology, Michigan State University, East
Lansing, MI 48824
1990-2003 Chairperson, Department of Microbiology & Molecular Genetics, Michigan State
University, East Lansing, MI 48824.

Additional Professional Experience 

Member, USDA-ARS Avian Disease and Oncology Lab User Liaison Group, 1999-present

Review Panel Member, NP101 Food Animal Production: Genetics, Genomics, and Germplasm, 
USDA-ARS Office of Scientific Quality Review, 2002 

US Review Panel Member, Animal Production, BARD Fund, 1998-2000 

Panel Manager, Animal Genome Mapping and Genetic Mechanisms, USDA-NRICGP, 1998

Member, USDA-CSRS-NRI Animal Genome Mapping and Genetic Mechanisms Review Panel, 
(June 1994, May 2000, October 2005) 

Member, USDA Animal Molecular Biology Review Panel, (May 1987) 
Member, N.I.H. Genetics Study Section, 1988-1992.

Ad Hoc Reviewer; N.I.H. Study Sections: Molecular Biology (February 1982 and June, 1983),
Genetics (October 1983, June 1986), and Genetics/SBIR (Nov. 1998). Special NIAID Innate 
Immunity Program Project Review Panel, May, 2004; NIAID Biodefense and Emerging 
Infectious Disease Program Project Review Panel, Feb., 2005. 

Instructor; Cold Spring Harbor Molecular Cloning Course, Summer 1982. 

Member, Peer Review Board, Alabama Research Institute, 1985-1987. 

Associate Editor, Poultry Science, 1995-2001 

Editorial Board, Journal of Heredity, 2003 -present 

Occasional ad hoc reviewer: Nature, Proc. Natl Acad. Sci. USA, Nucleic Acids Res., J. Biol. 
Chem., Gene, USDA, NSF, VA, Genomics, Animal Genetics, other journals 

Member, NC-168 Regional Research Committee, 1989 to 2003, member NC-1008, 2003 to 
present; occasional ad hoc participant, NE-60 and NCR-150 

Coordinator of Poultry Genome Mapping, USDA, National Animal Genome Research Program, 
NRSP-8; 1994-present, (Co-coordinator, 1993-1994) 

Member, USDA Poultry Genome Mapping NAGRP Species Committee, 1992-present 
Member, Avian Genetic Resources Task Force, 1996-2000 



Honors and Awards 

National Merit Scholar (1965-1969) 

National Science Foundation Fellowship (1969-1970 and 1972-1974) 
Wisconsin Alumni Research Foundation Fellowship (1974-1975) 
Wharton Fellow (1975-1976) 

American Cancer Society Research Fellow (1976-1978) 
National Institutes of Health Postdoctoral Fellow (1978-1979) 
Research Career Development Awardee, N.I.H. (1984-1989) 

MSU College of Natural Science Alumni Assoc. Distinguished Faculty Award, 1990 
USDA Group Honor Award, 1997, jointly presented to NAGRP Species Coordinators 
MSU College of Natural Science Distinguished Faculty Award, 2001 

Merck Award for Achievement in Poultry Science 2003, presented by the Poultry Science 
Association 

Pfizer Award for Research Excellence 2005, presented by the MSU College of Veterinary 
Medicine

Society Memberships

AAAS 
ASM 

Poultry Science Assoc. 

American Genetic Assoc. 

International Society for Animal Genetics 

Invited Speaker for the Following : 

3/80 Department of Biological Sciences, Oakland University, Rochester, MI 

3/84 Department of Biological Sciences, Oakland University, Rochester, MI 

8/80 EMBO Workshop on RNA Processing in Eucaryotic Cells, Arolla, Switzerland 

9/80 Department of Biological Sciences, Wayne State University, Detroit, MI 

1/81 Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 

8/81 Red Cells Gordon Conference, Plymouth, NH

1/82 Department of Chemistry, Bowling Green State University, Bowling Green, OH 
7/82 Cold Spring Harbor Cloning Course, Cold Spring Harbor, NY 

3/84 Department of Biochemistry, Molecular and Cell Biology, Northwestern University, Evanston, 
IL 

3/85 Department of Biochemistry, Wayne State University School of Medicine, Detroit, MI 

3/85 Frederick Cancer Research Institute, Frederick, MD 

8/85 Department of Anatomy, University of California, San Francisco, CA 

3/86 Reunion Honoring Norman Davidson, Asilomar, CA 

4/89 Department of Molecular Biology, Wayne State University, School of Medicine 
9/90 NC-168, Regional Research Committee Meeting, Michigan State University 
9/91 NC-168, Regional Research Committee Meeting, Purdue University 
9/92 NC-168, Regional Research Committee Meeting, U. of Wisconsin, Madison 
9/93 NC-168, Regional Research Committee Meeting, North Carolina State University 
9/94 NC-168, Regional Research Committee Meeting, Minneapolis, MN 

8/94 Session chairperson and workshop chairperson, 5th World Congress on Genetics Applied
to Livestock Production, Guelph, Ontario
9/94 Meeting organizer, 2nd annual meeting of the National Animal Genome Research
Program, Minneapolis, MN
10/95 NC-168, Regional Research Committee and National Animal Genome Research Program
Meeting, College Park, MD
1/96 Animal Science Department, Michigan State University 
5/96 Poultry Breeders' Roundtable, St. Louis, MO 

7/96 Symposium on "Genetic Selection - Strategies for the Future", Annual Meeting of the
Poultry Science Association, Louisville, KY
9/99 NRI Workshop on Animal Genome Resource Development, Washington, DC
1/00 to 1/02, Organizing Committee, Plant and Animal Genome Annual Meetings
8/00 Invited speaker, World Poultry Congress, Montreal, CA
10/00 Invited speaker, Texas A&M University, Genetics Program
11/00 Invited speaker, Embrex Corp., Durham, N.C.

01/02 IFAFS grant presentation, PAG-X Workshop on Government Funding 

02/02 Consultant, National Academy of Sciences Workshop on Domestic Animal Genomics 

12/02 Consultant and MSU rep., USDA-ARS Stakeholder Group meeting , Beltsville, MD 

01/03 Invited speaker, Large-insert DNA Libraries and their Applications Workshop, Plant and Animal
Genome XI meeting, San Diego, CA
02/03 Invited speaker, Advances in Genome Biology & Technology Meeting, Marco Island, FL
10/03 Invited speaker, Livestock Genomes Symposium, Del Lago Resort, Lake Conroe, TX (2
presentations)

11/03 Invited speaker, Chicken Genome Sequence Symposium, Atlanta, GA

04/04 Invited participant, Chicken Genome: New Tools and Concepts, Kansas City, MO

06/04 Invited speaker, Int'l Symposium on Avian Endocrinology, Scottsdale, AZ

07/04 Invited speaker, PSA Annual Mtg. Ancillary Scientists Symposium on Agricultural Biosecurity:
Emerging Issues in Homeland Security and Food Safety, St. Louis, MO.
09/04 Invited Participant/Speaker, USDA Animal Genomics Workshop, Washington, DC
10/04 Invited speaker, Virginia Polytechnic Institute, Cell & Mol. Biol., Blacksburg, VA
12/04 Participant (with R. Wilson, L.W. Hillier and C. Ponting) in a phone-in news conference
arranged by Nature in connection with the publication describing the chicken genome sequence.
05/05 Session chairperson, Chicken Genome and Development Workshop, Cold Spring Harbor Lab.
09/05 Invited seminar speaker, Microbiology and Molecular Genetics, Michigan State U.
09/05 Invited keynote speaker, UC Davis Chicken Genome Biology-MDV Pathology Symposium
1/97 to present, annual presentations to NC-168/NRSP-8 Regional Research Comm. mtgs.
(NC-168 changed to NC-1008 in 2003)



Publications 

1. Burd, J. F., R. M. Wartell, J. B. Dodgson, and R. D. Wells. 1975. Transmission of stability 
(telestability) in deoxyribonucleic acid. J. Biol. Chem. 250:5109-5113.

2. Dodgson, J. B., I. F. Nes, B. W. Porter, and R. D. Wells. 1976. Two new genetic assays for
noninfectious fragments of φX174 DNA. Virology 69:782-785.

3. Blakesley, R. W., J. B. Dodgson, I. F. Nes, and R. D. Wells. 1977. Duplex regions in "single-
stranded" φX174 DNA are cleaved by a restriction endonuclease from Haemophilus aegyptius.
J. Biol. Chem. 252:7300-7306. 

4. Chan, H. W., J. B. Dodgson, and R. D. Wells. 1977. Influence of DNA structure on the lactose 
operator-repressor interaction. Biochemistry 16:2356-2366. 

5. Dodgson, J. B., and R. D. Wells. 1977. Action of single-strand specific nucleases on model 
DNA heteroduplexes of defined size and sequence. Biochemistry 16:2374-2379. 

6. Dodgson, J. B., and R. D. Wells. 1977. Synthesis and thermal melting behavior of 
oligomer.polymer complexes containing defined lengths of mismatched dA.dG and dG.dG 
nucleotides. Biochemistry 16:2367-2374. 

7. Wells, R. D., R. W. Blakesley, J. F. Burd, H. W. Chan, J. B. Dodgson, S. C. Hardies, G. T. 
Horn, K. F. Jensen, J. E. Larson, I. F. Nes, E. Selsing, and R. M. Wartell. 1977. The role of
DNA structure in genetic regulation. CRC Crit. Rev. Biochem. 4:305-340. 

8. Engel, J. D., and J. B. Dodgson. 1978. Analysis of the adult and embryonic chicken globin 
genes in chromosomal DNA. J. Biol. Chem. 253:8239-8246. 

9. Dodgson, J. B., J. Strommer, and J. D. Engel. 1979. The isolation of the chicken β-globin gene
and a linked embryonic β-like globin gene from a chicken DNA recombinant library. Cell
17:879-887. 

10. Dodgson, J. B., J. Strommer, and J. D. Engel. 1979. The organization of chicken globin genes, 
pp. 383-392. In T. Maniatis and R. Axel (ed.), Eucaryotic gene regulation. Academic Press, 
New York. 

11. Hughes, S. H., E. Stubblefield, F. Payvar, J. D. Engel, J. B. Dodgson, D. Spector, B. Cordell, R. 
T. Schimke, and H. E. Varmus. 1979. Gene localization by chromosome fractionation: Globin 
genes are on at least two chromosomes and three estrogen-inducible genes are on three 
chromosomes. Proc. Nat. Acad. Sci. 76:1348-1352.

12. Engel, J. D., and J. B. Dodgson. 1980. Analysis of the closely linked adult chicken α-globin
genes in recombinant DNAs. Proc. Nat. Acad. Sci. USA 77:2596-2600. 

13. Perler, F., A. Efstratiadis, P. Lomedico, W. Gilbert, R. Kolodner, and J.B. Dodgson. 1980. The 
evolution of genes: The chicken preproinsulin gene. Cell 20:555-566. 

14. Stalder, J., M. Groudine, J. B. Dodgson, J. D. Engel, and H. Weintraub. 1980. Hb switching in 
chickens. Cell 19:973-980. 

15. Dodgson, J. B., K. C. McCune, D. J. Rusling, A. Krust, and J. D. Engel. 1981. Adult chicken
α-globin genes, αA and αD: No anemic shock α-globin exists in domestic chickens. Proc. Nat.
Acad. Sci. 78:5998-6002.

16. Dolan, M., B. J. Sugarman, J. Dodgson, and J. D. Engel. 1981. Chromosomal arrangement of the
chicken β-type globin genes. Cell 24:669-677.

17. Engel, J. D., and J. B. Dodgson. 1981. Histone genes are clustered but not tandemly repeated in 
the chicken genome. Proc. Nat. Acad. Sci. USA 78:2856-2860. 

18. Engel, J. D., B. J. Sugarman, and J. B. Dodgson. 1982. A chicken histone H3 gene contains 
intervening sequences. Nature 297:434-436. 

19. Grandy, D. K., J. D. Engel, and J. B. Dodgson. 1982. Complete nucleotide sequence of a 
chicken H2b histone gene. J. Biol. Chem. 257:8577-8580. 

20. Dodgson, J. B., and J. D. Engel. 1983. The nucleotide sequence of the chicken adult α-globin
genes. J. Biol. Chem. 258:4623-4629. 

21. Dodgson, J. B., S. J. Stadt, O.-R. Choi, M. Dolan, H. D. Fischer, and J. D. Engel. 1983. The 
nucleotide sequence of the embryonic chicken β-type globin genes. J. Biol. Chem. 258:12685-12692.

22. Dolan, M., J. B. Dodgson, and J. D. Engel. 1983. Analysis of the adult chicken β-globin. J.
Biol. Chem. 258:3983-3990.

23. Engel, J. D., D. J. Rusling, K. D. McCune, and J. B. Dodgson. 1983. Unusual structure of the
chicken embryonic α-globin gene, π'. Proc. Natl. Acad. Sci. USA 80:1392-1396.

24. Grandy, D. K., J. D. Engel, and J. B. Dodgson. 1983. The chicken H2b gene family, pp. 445- 
455. In Gene Expression, Alan R. Liss, Inc., New York. 

25. Sugarman, B. J., J. B. Dodgson, and J. D. Engel. 1983. Genomic organization, DNA sequence, 
and expression of chicken embryonic histone genes. J. Biol. Chem. 258:9005-9016. 

26. Chang, K. S., W. E. Zimmer, Jr., D. J. Bergsma, J. B. Dodgson, and R. J. Schwartz. 1984. 
Isolation and characterization of six different chicken actin genes. Mol. Cell Biol. 4:2498-2508.

27. Fischer, H. D., J. B. Dodgson, S. Hughes, and J. D. Engel. 1984. An unusual 5' splice sequence 
is efficiently utilized in vivo. Proc. Natl. Acad. Sci. USA 81:2733-2737. 

28. Yoshihara, C. M., M. Federspiel, and J. B. Dodgson. 1984. Isolation of the chicken carbonic 
anhydrase II gene. Ann. NY Acad. Sci. 429:332-334.

29. Brush, D., J. B. Dodgson, O.-R. Choi, P. W. Stevens, and J. D. Engel. 1985. Replacement
variant histone genes contain intervening sequences. Mol. Cell Biol. 5:1307-1317.

30. Yamamoto, M., N. S. Yew, M. Federspiel, J. B. Dodgson, N. Hayashi, and J. D. Engel. 1985.
Isolation of recombinant cDNAs encoding chicken erythroid δ-aminolevulinate synthase. Proc.
Natl. Acad. Sci. USA 82:3702-3706. 

31. Dodgson, J. B., M. Yamamoto, and J. D. Engel. 1987. Chicken histone H3.3B cDNA sequence 
confirms unusual 3' UTR structure. Nucl. Acids Res. 15:6294. 

32. Grandy, D. K., and J. B. Dodgson. 1987. Structure and organization of the chicken H2B histone 
gene family. Nucl. Acids Res. 15:1063-1080. 

33. Kivela, J., H.-J. Kung, J. Dodgson, and L. Bacon. 1987. Cloning of a putative chicken MHC 
class II alpha chain gene, pp. 119-206. In W. T. Weber and D. L. Ewert (eds.), Progress in 
clinical and biological research, avian immunology, vol. 238. Alan R. Liss, Inc., New York, NY. 

34. Moriarity, D. M, K. J. Barringer, J. B. Dodgson, H. E. Richter, and R. B. Young. 1987. Genomic 
clones encoding chicken myosin heavy-chain genes. DNA 6:91-99. 

35. Stevens, P. W., J. B. Dodgson, and J. D. Engel. 1987. Structure and expression of the chicken 
ferritin H-subunit gene. Mol. Cell. Biol. 7:1751-1758. 

36. Swift, R. W, A. Ridgeway, D. Fujita, J. B. Dodgson, and H.-J. Kung. 1987. B-lymphoma 
induction by reticuloendotheliosis virus: Characterization of a mutated CSV provirus involved
in c-myc activation. J. Virol. 61:2084-2090. 

37. Yoshihara, C. M., J.-D. Lee, and J. B. Dodgson. 1987. The chicken carbonic anhydrase II gene: 
evidence for a recent shift in intron position. Nucl. Acids. Res. 15:753-770. 

38. Dodgson, J. B., D. L. Browne, and A. J. Black. 1988. Chicken chromosomal protein HMG-14 
and HMG-17 cDNA clones: isolation, characterization and sequence comparison. Gene 
63:287-295. 

39. Lewis, W., J.-D. Lee, and J. B. Dodgson. 1991. Adult chicken α-globin gene expression in
transfected QT6 quail cells: evidence for a negative regulatory element in the αD gene region.
Nucl. Acids Res. 19:5321-5329. 

40. Disela, C, C. Glineur, T. Bugge, J. Sap, G. Stengl, J. Dodgson, H. Stunnenberg, H. Beug, and M. 
Zenke. 1991. v-erbA overexpression is required to extinguish c-erbA function in erythroid cell 
differentiation and regulation of the erbA target gene CAII. Genes & Dev. 5:2033-2047. 

41. Pharr, G.T., J.B. Dodgson, and L. Bacon. 1993. Analysis of B-LB chain gene expression in two 
chicken cDNA libraries. Immunogenetics 37:381-385. 

42. Crittenden, L.B., L. Provencher, L. Santangelo, I. Levin, H. Abplanalp, R.W. Briles, W.E. Briles, 
and J. B. Dodgson. 1993. Characterization of a red jungle fowl by white leghorn backcross 
reference population for molecular mapping of the chicken genome. Poultry Science 72:334-348.

43. Browne, D. L., and J. B. Dodgson. 1993. The chicken chromosomal protein HMG-14a-encoding 
gene is transcribed into multiple mRNA's. Gene 124:199-206. 

44. Levin, I., L.B. Crittenden, and J. B. Dodgson. 1993. Genetic map of the chicken Z chromosome 
using random amplified polymorphic DNA (RAPD) markers. Genomics 16:224-230.

45. Pharr, G.T., H.D. Hunt, L.D. Bacon, and J.B. Dodgson. 1993. Identification of class II major
histocompatibility complex polymorphisms predicted to be important in peptide antigen
presentation. Poultry Science 72:1312-1317.

46. Levin, I., L. Santangelo, H. Cheng, L.B. Crittenden, and J.B. Dodgson. 1994. An autosomal
genetic linkage map of the chicken. J. Hered. 85:79-85.

47. Levin, I., L.B. Crittenden, and J. B. Dodgson. 1994. Polymorphic DNA mapping using avian
CR1 element-derived PCR primers. J. Hered. 85:73-78.

48. Pharr, G.T., L.D. Bacon, and J.B. Dodgson. 1994. A class I cDNA from SPAFAS line 11
chickens. Eur. J. Immunogenetics 21:59-66.

49. Li, Y. and J.B. Dodgson. 1995. The chicken HMG17 gene is dispensable for cell growth in vitro.
Mol. Cell Biol. 15:5516-5523.

50. Cheng, H.H., I. Levin, R.L. Vallejo, H. Khatib, J.B. Dodgson, L.B. Crittenden, and J. Hillel. 
1995. Development of a genetic map of the chicken with markers of high utility. Poultry 
Science 74:1855-1874. 

51. Okimoto, R. and J.B. Dodgson. 1996. Improved PCR amplification of multiple specific alleles 
(PAMSA) using internally mismatched primers. BioTechniques 21:20-26. 

52. Li, Y., J.R. Strahler, and J. B. Dodgson. 1997. Neither HMG-14a nor HMG-17 gene function is 
required for growth of chicken DT40 cells or maintenance of DNaseI-hypersensitive sites.
Nucleic Acids Res. 25:283-288. 

53. Dodgson, J.B., H.H. Cheng, and R. Okimoto. 1997. DNA marker technology: a revolution in

A compilation of nucleic acid sequences from E. coli and its phages has been analysed
for the frequency of occurrence of nearest neighbour base doublets and codons. Several 
statistically significant deviations from random are found in both doublet and codon 
frequencies. The deviations in E. coli also appear to occur in lambda and in the coat 
protein gene of MS2, whereas T4 and other parts of the MS2 genome show different 
sequence properties. These and other findings are discussed in relation to the 
hypothesis that rapidity of translation of mRNAs in the E. coli system is dependent on 
doublet frequency and codon usage patterns. 
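
As an illustration of the kind of analysis summarized above, the following is a minimal Python sketch of a nearest-neighbour doublet comparison. It is not taken from the cited study. The function name doublet_odds is hypothetical, and the sketch assumes that "random" means the two bases of a doublet occur independently, so that the expected doublet frequency is the product of the two single-base frequencies.

```python
from collections import Counter


def doublet_odds(seq):
    """Return observed/expected ratios for each nearest-neighbour doublet in seq.

    Assumption: the expected frequency of a doublet is the product of the
    frequencies of its two bases (independent bases).
    """
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    base_counts = Counter(seq)
    # Overlapping doublets: positions 0-1, 1-2, 2-3, ...
    doublet_counts = Counter(seq[i:i + 2] for i in range(n - 1))
    odds = {}
    for doublet, count in doublet_counts.items():
        observed = count / (n - 1)
        expected = (base_counts[doublet[0]] / n) * (base_counts[doublet[1]] / n)
        odds[doublet] = observed / expected
    return odds
```

A ratio well above or below 1 for a doublet would correspond to the kind of deviation from random that the abstract describes; whether such a deviation is statistically significant is a separate question that this sketch does not address.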



Exhibit 2

Mol Biol Evol. 1985 Jan;2(1):13-34.

Codon usage and tRNA content in unicellular and multicellular organisms.

Ikemura T.



Department of Biophysics, Faculty of Science, Kyoto University, Japan. 



Choices of synonymous codons in unicellular organisms are here reviewed, and 
differences in synonymous codon usages between Escherichia coli and the yeast 
Saccharomyces cerevisiae are attributed to differences in the actual populations of 
isoaccepting tRNAs. There exists a strong positive correlation between codon usage and 
tRNA content in both organisms, and the extent of this correlation relates to the protein 
production levels of individual genes. Codon-choice patterns are believed to have been 
well conserved during the course of evolution. Examination of silent substitutions and 
tRNA populations in Enterobacteriaceae revealed that the evolutionary constraint 
imposed by tRNA content on codon usage decelerated rather than accelerated the 
silent-substitution rate, at least insofar as pairs of taxonomically related organisms were 
examined. Codon-choice patterns of multicellular organisms are briefly reviewed, and 
diversity in G+C percentage at the third position of codons in vertebrate genes, as well
as a possible causative factor in the production of this diversity, is discussed.

ABSTRACT

A simple, effective measure of synonymous codon usage bias, the Codon
Adaptation Index, is detailed. The index uses a reference set of highly
expressed genes from a species to assess the relative merits of each codon,
and a score for a gene is calculated from the frequency of use of all codons
in that gene. The index assesses the extent to which selection has been
effective in moulding the pattern of codon usage. In that respect it is
useful for predicting the level of expression of a gene, for assessing the
adaptation of viral genes to their hosts, and for making comparisons of
codon usage in different organisms. The index may also give an approximate
indication of the likely success of heterologous gene expression. 
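
The abstract above does not spell out the calculation, so the following Python sketch is offered only as one plausible reading of it. It assumes the commonly described form of the index, in which each codon is weighted by its count relative to the most used synonymous codon in the reference set, and a gene is scored by the geometric mean of the weights of its codons. The two-amino-acid codon table, the reference sequence used in the usage note, the 0.5 floor for unseen codons, and the function names are all illustrative assumptions.

```python
import math
from collections import Counter

# Toy synonymous-codon families (assumption: two amino acids only, for brevity).
SYNONYMS = {
    "K": ["AAA", "AAG"],
    "F": ["TTT", "TTC"],
}


def codons(seq):
    """Split an in-frame coding sequence into codons, dropping a trailing partial codon."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_genes):
    """Weight each codon by its count relative to the most used synonym in the reference set."""
    counts = Counter(c for gene in reference_genes for c in codons(gene))
    weights = {}
    for family in SYNONYMS.values():
        top = max(counts[c] for c in family)
        if top == 0:
            continue  # amino acid absent from the reference set: leave it unscored
        for c in family:
            # Assumption: floor zero counts at 0.5 so that log(weight) stays defined.
            weights[c] = max(counts[c], 0.5) / top
    return weights


def cai(gene, weights):
    """Geometric mean of the weights of the scored codons in the gene."""
    logs = [math.log(weights[c]) for c in codons(gene) if c in weights]
    if not logs:
        raise ValueError("gene contains no scored codons")
    return math.exp(sum(logs) / len(logs))
```

With the hypothetical reference sequence "AAGAAGAAGAAATTCTTCTTCTTT", a gene built only from the most used codons (AAG, TTC) scores 1, and a gene built only from the less used synonyms scores lower, which matches the abstract's description of a score that reflects how closely a gene follows the usage of highly expressed genes.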

INTRODUCTION 

The determination of the DNA sequences of a large number of genes from 
a wide variety of species has revealed that, in a large proportion of cases,
the alternative synonymous codons for any one amino acid are not used
randomly (1, and references therein). Further, it has been noted that a part
of this nonrandom usage is species, or rather taxon, specific (2). However,
within species there is considerable heterogeneity between genes, and in the
two best studied organisms, namely Escherichia coli and the yeast
Saccharomyces cerevisiae, there is a clear positive correlation between
degree of codon bias and level of gene expression (3,4). Examination of 
large data sets from these species reveals that within species differences 
are largely in the degree rather than the direction of codon usage bias 
(5,6). 

For many reasons it is desirable to quantify the degree of bias in 
codon usage in each gene in such a way that comparisons can be made both
within and between species. One approach to this problem is to devise a
measure for assessing the degree of deviation from a postulated impartial
pattern of usage. The codon preference bias proposed by McLachlan et al. (7)
is such a measure. Recently Sharp et al. (5) have proposed to calculate the
chi square value for the deviation from random codon usage and then scale
the value by the gene length (number of codons) so that comparisons can be
made between genes. 
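
The scaled chi square measure described above can be sketched as follows. This is an illustrative Python example, not the published procedure: it assumes that "random codon usage" means equal use of all synonymous codons for an amino acid, and it uses a toy two-amino-acid codon table. The name scaled_chi_square is hypothetical.

```python
from collections import Counter

# Toy synonymous-codon families (assumption: two amino acids only, for brevity).
SYNONYMS = {
    "K": ["AAA", "AAG"],
    "F": ["TTT", "TTC"],
}


def scaled_chi_square(gene):
    """Chi square for deviation from equal synonymous codon usage, divided by gene length in codons."""
    cods = [gene[i:i + 3] for i in range(0, len(gene) - len(gene) % 3, 3)]
    if not cods:
        raise ValueError("gene contains no complete codon")
    counts = Counter(cods)
    chi2 = 0.0
    for family in SYNONYMS.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid not used in this gene
        expected = total / len(family)  # equal usage among synonyms
        chi2 += sum((counts[c] - expected) ** 2 / expected for c in family)
    return chi2 / len(cods)
```

Dividing by the number of codons is what allows a short gene and a long gene to be compared: without it, a longer gene with the same proportional bias would produce a larger chi square value simply because it contains more codons.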

Another approach is to assess the relative merits of different codons 
from the viewpoint of translational efficiency. For example, Ikemura (1,8,9) 
has identified certain "optimal" codons in E.coli and yeast which are 
expected to be translated more efficiently than others, and calculated the 
frequency of optimal codons in a gene. The codon bias index of Bennetzen and 
Hall (4), for use with yeast genes, is essentially similar. Such indices are 
certainly useful, but have several disadvantages. First, some amino acids 
are usually excluded because it is not clear which codons are "optimal". 
Second, all codons considered are classified into only two categories, i.e., 
optimal and nonoptimal, with no recognition that some codons within each 
category are better than others. Third, there is no good basis for 
comparison between species because the proportional division of the codon 
table into the two categories may differ; e.g., Ikemura (1) identified 21 
optimal codons for 14 amino acids in E.coli, and 19 optimal codons for 13 
amino acids in yeast. 

Gribskov et al. (10) have recently proposed another index, the codon 
preference statistic. This statistic is based on the ratio of the likelihood 
of finding a particular codon in a highly expressed gene to the likelihood 
of finding that codon in a random sequence with the same base composition 
as that in the sequence under study. They show that the statistic is useful 
for locating genes in sequenced DNA, for predicting the relative level of 
their expression, and for detecting sequencing errors. However, the statistic 
is not normalized and therefore the values for two genes encoding proteins 
with different amino acid compositions can be quite different even if both 
genes use only the "best" codons. 

With various purposes in mind we have devised a new index. It is 
similar to the codon preference statistic but is normalized so that it is 
convenient for making comparisons both within and between species. After 
describing the index, we show some rather varied applications and indicate 
certain advantages over other indices. In recognition of the role of natural 
selection in producing high levels of codon bias, we call this statistic the 
Codon Adaptation Index. 

METHODS 

We recognize that even in E.coli and yeast the factors determining the 
frequency of synonymous codon usage are not completely understood, but that 
several points are clear: the pattern of codon usage in any particular gene 
is largely determined by natural selection and mutation (5,6); selection 
appears to occur via translational efficiency, so that synonymous codon 
usage in highly expressed genes is under the strongest selective constraints 
(4,8,9); in E.coli and yeast, very highly expressed genes appear to have the 
greatest degree of synonymous codon bias (3-6,8). From these points it is 
deduced that the pattern of codon usage in very highly expressed genes can 
reveal (i) which of the alternative synonymous codons for an amino acid is 
the most efficient for translation, and (ii) the relative extent to which 
other codons are disadvantageous. 

The first step is, then, to construct a reference table of relative 
synonymous codon usage (RSCU) values from very highly expressed genes of the 
organism in question. An RSCU value for a codon is simply the observed 
frequency of that codon divided by the frequency expected under the 
assumption of equal usage of the synonymous codons for an amino acid (5). 
Thus, 

$$\mathrm{RSCU}_{ij} = \frac{X_{ij}}{\frac{1}{n_i}\sum_{j=1}^{n_i} X_{ij}} \qquad [1]$$

where $X_{ij}$ is the number of occurrences of the jth codon for the ith amino 
acid, and $n_i$ is the number (from one to six) of alternative codons for the 
ith amino acid. The relative adaptiveness of a codon, $w_{ij}$, is then the 
frequency of use of that codon compared to the frequency of the optimal 
codon for that amino acid: 

$$w_{ij} = \mathrm{RSCU}_{ij} / \mathrm{RSCU}_{i\max} = X_{ij} / X_{i\max} \qquad [2]$$

where $\mathrm{RSCU}_{i\max}$ and $X_{i\max}$ are the RSCU and X values for the most 
frequently used codon for the ith amino acid. 
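
Equations [1] and [2] can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `rscu_and_w`, the two-family codon subset and the codon counts are hypothetical and are not taken from the reference gene sets. The replacement of zero counts by 0.5 follows the rule stated further on in the Methods.

```python
# Synonymous codon families for two amino acids only (illustrative subset).
FAMILIES = {
    "Phe": ["UUU", "UUC"],
    "Ile": ["AUU", "AUC", "AUA"],
}

def rscu_and_w(counts, families, zero_value=0.5):
    """Return ({codon: RSCU}, {codon: w}) following equations [1] and [2]."""
    rscu, w = {}, {}
    for codons in families.values():
        # X_ij for each synonymous codon; a zero count is replaced by 0.5
        x = [counts.get(c, 0) or zero_value for c in codons]
        mean_x = sum(x) / len(codons)   # (1/n_i) * sum over j of X_ij
        x_max = max(x)                  # X_imax, the most frequently used codon
        for codon, x_ij in zip(codons, x):
            rscu[codon] = x_ij / mean_x
            w[codon] = x_ij / x_max
    return rscu, w

# Hypothetical counts, chosen only to show the arithmetic.
counts = {"UUU": 10, "UUC": 30, "AUU": 12, "AUC": 48, "AUA": 0}
rscu, w = rscu_and_w(counts, FAMILIES)
print(rscu["UUU"], rscu["UUC"], w["UUU"])
```

With these counts the RSCU values within each amino acid sum to the number of synonymous codons, and the most frequent codon of each family has w = 1.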

Codon usage data have been compiled previously for 165 genes from 
E.coli (6), and for 110 genes from yeast (5). To obtain reference RSCU 
values, we have taken the 27 very highly expressed E.coli genes compiled by 
Sharp and Li (6), which include genes encoding 17 ribosomal proteins, four 
outer membrane proteins and four elongation factors. For yeast a set of 24 
genes has been taken from the high expression group previously identified 
(5). These include 16 genes encoding ribosomal proteins, one for an 
elongation factor, and seven loci encoding very abundant enzymes. The RSCU 
and w values obtained for very highly expressed genes from E.coli and yeast 
are given in Table 1. 

Table 1. Values of RSCU and w for codons in very highly expressed genes from 
E.coli and yeast. 

                  E.coli             Yeast 
AA    Codon    RSCU     w         RSCU     w 

Phe   UUU      0.456   0.296      0.203   0.113 
      UUC      1.544   1.000      1.797   1.000 
Leu   UUA      0.106   0.020      0.601   0.117 
      UUG      0.106   0.020      5.141   1.000 
Leu   CUU      0.225   0.042      0.029   0.006 
      CUC      0.198   0.037      0.014   0.003 
      CUA      0.040   0.007      0.200   0.039 
      CUG      5.326   1.000      0.014   0.003 
Ile   AUU      0.466   0.185      1.352   0.823 
      AUC      2.525   1.000      1.643   1.000 
      AUA      0.008   0.003      0.005   0.003 
Met   AUG      1.000   1.000      1.000   1.000 
Val   GUU      2.244   1.000      2.161   1.000 
      GUC      0.148   0.066      1.796   0.831 
      GUA      1.111   0.495      0.004   0.002 
      GUG      0.496   0.221      0.039   0.018 

Ser   UCU      2.571   1.000      3.359   1.000 
      UCC      1.912   0.744      2.327   0.693 
      UCA      0.198   0.077      0.122   0.036 
      UCG      0.044   0.017      0.017   0.005 
Pro   CCU      0.231   0.070      0.179   0.047 
      CCC      0.038   0.012      0.036   0.009 
      CCA      0.442   0.135      3.776   1.000 
      CCG      3.288   1.000      0.009   0.002 
Thr   ACU      1.804   0.965      1.899   0.921 
      ACC      1.870   1.000      2.063   1.000 
      ACA      0.141   0.076      0.025   0.012 
      ACG      0.185   0.099      0.013   0.006 
Ala   GCU      1.877   1.000      3.005   1.000 
      GCC      0.228   0.122      0.948   0.316 
      GCA      1.099   0.586      0.044   0.015 
      GCG      0.796   0.424      0.004   0.001 

Tyr   UAU      0.386   0.239      0.132   0.071 
      UAC      1.614   1.000      1.868   1.000 
ter   UAA 
ter   UAG 
His   CAU      0.451   0.291      0.394   0.245 
      CAC      1.549   1.000      1.606   1.000 
Gln   CAA      0.220   0.124      1.987   1.000 
      CAG      1.780   1.000      0.013   0.007 
Asn   AAU      0.097   0.051      0.100   0.053 
      AAC      1.903   1.000      1.900   1.000 
Lys   AAA      1.596   1.000      0.237   0.135 
      AAG      0.404   0.253      1.763   1.000 
Asp   GAU      0.605   0.434      0.713   0.554 
      GAC      1.395   1.000      1.287   1.000 
Glu   GAA      1.589   1.000      1.968   1.000 
      GAG      0.411   0.259      0.032   0.016 

Cys   UGU      0.667   0.500      1.857   1.000 
      UGC      1.333   1.000      0.143   0.077 
ter   UGA 
Trp   UGG      1.000   1.000      1.000   1.000 
Arg   CGU      4.380   1.000      0.718   0.137 
      CGC      1.561   0.356      0.008   0.002 
      CGA      0.017   0.004      0.008   0.002 
      CGG      0.017   0.004      0.008   0.002 
Ser   AGU      0.220   0.085      0.070   0.021 
      AGC      1.055   0.410      0.105   0.031 
Arg   AGA      0.017   0.004      5.241   1.000 
      AGG      0.008   0.002      0.017   0.003 
Gly   GGU      2.283   1.000      3.898   1.000 
      GGC      1.652   0.724      0.077   0.020 
      GGA      0.022   0.010      0.009   0.002 
      GGG      0.043   0.019      0.017   0.004 

Genes used: 

E.coli - 17 ribosomal protein genes, 4 elongation factor genes, 4 outer 
membrane protein genes, recA, dnaK (data from Ref.6) 

Yeast - 16 ribosomal protein genes, TEF1, 2 enolase genes, 2 GA-3-PDH 
genes, ADH1, PGK, pyruvate kinase (data sources given in Ref.5) 

The Codon Adaptation Index (CAI) for a gene is then calculated as the 
geometric mean of the RSCU values (from Table 1) corresponding to each of 
the codons used in that gene, divided by the maximum possible CAI for a gene 
of the same amino acid composition, i.e., 

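$$\mathrm{CAI} = \mathrm{CAI}_{obs} / \mathrm{CAI}_{max} \qquad [3]$$

where 

$$\mathrm{CAI}_{obs} = \left( \prod_{k=1}^{L} \mathrm{RSCU}_{k} \right)^{1/L} \qquad [4]$$

$$\mathrm{CAI}_{max} = \left( \prod_{k=1}^{L} \mathrm{RSCU}_{k\max} \right)^{1/L} \qquad [5]$$

where $\mathrm{RSCU}_{k}$ is the RSCU value for the kth codon in the gene, 
$\mathrm{RSCU}_{k\max}$ is the maximum RSCU value for the amino acid encoded by the 
kth codon in the gene, and L is the number of codons in the gene. 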

Note that if a certain codon is never used in the reference set then 
the CAI for any other gene in which that codon appears becomes zero. To 
overcome this problem we assign a value of 0.5 to any X that would 
otherwise be zero. Also, the number of AUG and UGG codons are subtracted 
from L, since the RSCU values for AUG and UGG are both fixed at 1.0, and so 
do not contribute to the CAI. 

As illustration, consider the rpsU gene from E.coli which, excluding 
the initiation codon, comprises 70 codons and has the sequence: 

CCG.GTA.ATT.AAA.GTA..... 

For that sequence and from the RSCU values in Table 1: 

$$\mathrm{CAI}_{obs} = (3.288 \times 1.111 \times 0.466 \times 1.596 \times 1.111 \times \ldots)^{1/70}$$

and 

$$\mathrm{CAI}_{max} = (3.288 \times 2.244 \times 2.525 \times 1.596 \times 2.244 \times \ldots)^{1/70}$$

From these two values and equation [3] we can obtain the CAI value. 
We note that equation [3] is exactly equivalent to: 

$$\mathrm{CAI} = \left( \prod_{k=1}^{L} w_{k} \right)^{1/L} \qquad [6]$$

where $w_{k}$ is the w value for the kth codon in the gene (see equation [2]). 
Therefore, for rpsU: 

$$\mathrm{CAI} = (1.00 \times 0.495 \times 0.185 \times 1.000 \times 0.495 \times \ldots)^{1/70}$$

Table 2. CAI values for E.coli and yeast genes. 

E.coli                      Yeast 
gene      CAI               gene          CAI 

17 RPs    0.467-0.813       16 RPs        0.529-0.915 
rpsU      0.726             histones      0.532-0.733 
rpoD      0.582 
dnaG      0.271             2μ plasmid    0.099-0.106 
lacI      0.296             GAL4          0.116 
trpR      0.267             PPR1          0.114 
lpp       0.849 (a)         GPD1          0.929 (a) 
hsdS      0.218 (b)         mat A2        0.098 (b) 

RPs - ribosomal protein genes. 
(a) highest CAI value among data set. 
(b) lowest CAI value among data set. 



Equation [6] saves computation time. To overcome real number underflow 
problems in computer calculations, equation [6] can be computed as: 

$$\mathrm{CAI} = \exp\left( \frac{1}{L} \sum_{k=1}^{L} \ln w_{k} \right) \qquad [7]$$

or from a codon usage table: 

$$\mathrm{CAI} = \exp\left( \frac{1}{L} \sum_{i=1}^{18} \sum_{j=1}^{n_i} X_{ij} \ln w_{ij} \right) \qquad [8]$$

where $X_{ij}$ and $n_i$ are as defined in equation [1]. 

There is no intrinsic effect of gene length (L) on CAI, but CAI values 
from short genes may be more variable due to sampling effects. 
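
The log-sum form of equation [7] can be sketched as follows. This is a minimal illustration, not the authors' program: the function name `cai` is hypothetical, the w table holds only the four distinct E.coli codons of the rpsU fragment quoted above (values from Table 1), and the score is computed for those five codons alone rather than for the full 70-codon gene. A complete w table would be needed for a real sequence, and any codon missing from the table (including a termination codon) raises a `KeyError` here.

```python
import math

# w values from Table 1 (E.coli) for the codons in the quoted rpsU fragment.
W_ECOLI = {"CCG": 1.000, "GUA": 0.495, "AUU": 0.185, "AAA": 1.000}

def cai(seq, w_table):
    """CAI of an in-frame coding sequence, computed as in equation [7]."""
    seq = seq.upper().replace("T", "U")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    # AUG and UGG have w fixed at 1.0, so they are excluded from L.
    scored = [c for c in codons if c not in ("AUG", "UGG")]
    if not scored:
        raise ValueError("no codons with synonymous alternatives")
    log_sum = sum(math.log(w_table[c]) for c in scored)
    return math.exp(log_sum / len(scored))

fragment = "CCGGTAATTAAAGTA"  # first five codons only, not the whole gene
print(round(cai(fragment, W_ECOLI), 3))
```

Summing logarithms avoids the underflow that a direct product of many small w values could cause in a long gene, which is the reason given above for preferring equation [7] in computer calculations.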

APPLICATIONS and DISCUSSION 

Predicting levels of gene expression within a species. 

CAI values clearly parallel levels of gene expression. Ribosomal 
protein genes are highly expressed, and have generally high CAI values 
(Table 2, Figure 1). Among yeast ribosomal protein genes only that encoding 
S33 has a CAI < 0.6, and it is a very short gene (L = 65). Lowly expressed 
regulatory genes (e.g., lacI, trpR in E.coli; GAL4, PPR1 in yeast) have 
low CAI values (Table 2). In E.coli the relationship between codon bias and 
gene expression is perhaps best illustrated by considering operons (as 
suggested by Gouy and Gautier, Ref.3). For example, within the 
macromolecular synthesis operon the expression levels are rpsU >> rpoD >> dnaG 
(11), and the CAI values for these genes are 0.726, 0.582 and 0.271, 
respectively (Table 2). Eight of the nine genes of the unc operon encode the 
eight subunits of the F0 and F1 sectors of the H+-ATPase complex, and the 
stoichiometry of these subunits is known (12). The CAI value is clearly 
correlated with the level of gene expression among the genes encoding 
subunits of the F1 sector (Table 3), with the CAI values for papA and papB 
being similar, and much higher than those for papE, papC and papG. Among 
genes encoding subunits in the F0 sector the rank order of CAI values 
corresponds to the relative amounts of the gene products required. The CAI 
for papH is perhaps surprisingly low, but this is a very short gene (Table 
3). The function of papI is unknown. The CAI value for papI is very low, and 
may indicate that this is a regulatory gene, or perhaps (see below) a 
noncoding open reading frame. 

Figure 1. Distribution of CAI values for (a) 106 yeast genes, (b) 165 
E.coli genes, and (c) 50 bacteriophage T7 genes. In (a) and (b) 
ribosomal protein genes are cross-hatched. Plasmid genes are excluded. 

Table 3. CAI values for genes in the unc operon of E.coli. 

                               Gene product 
Pos   Gene    CAI      L       name      amount   sector 

1     papI    0.238    127     ?? 
2     papD    0.400    253     chi       1        F0 
3     papH    0.583     71     omega     10       F0 
4     papF    0.482    152     psi       2        F0 
5     papE    0.374    169     delta     1        F1 
6     papA    0.665    501     alpha     3        F1 
7     papC    0.403    273     gamma     1        F1 
8     papB    0.650    444     beta      3        F1 
9     papG    0.474    133     epsilon   1        F1 

Pos: gene position within the operon (1 = 5'). 
The relative amount of each gene product in the ATPase 
complex is taken from Ref.12. 

Although many of the measures of codon bias discussed in the 
Introduction seem to be positively correlated with gene expression, we feel 
that CAI has the twin advantages of being simple to calculate and making 
greater quantitative use of available information (see 'Comparison of CAI 
with other indices' below). 

The positive correlation between degree of synonymous codon bias and 
expression level in E.coli (and yeast) seems firmly established, but the 
causal relationship between the two has been debated. We have concluded 
elsewhere (6) that the degree of codon bias reflects the past action of 
natural selection: it is indicative of the level at which the gene is 
expressed, rather than dictating that level. This seems to concur with 
conclusions drawn from a theoretical model of the translation process (13). 

Table 4. CAI values for mammalian genes using E.coli and yeast RSCU values. 

                                   Host 
Heterologous gene            E.coli    Yeast 

Human alpha interferon       0.218     0.099 
Human insulin                0.307     0.043 
Human growth hormone         0.287     0.082 
Human factor VIII            0.205     0.114 
Human factor IX              0.263     0.176 
Bovine chymosin              0.326     0.086 



Predicting levels of heterologous gene expression. 

There is experimental evidence that certain codons can affect 
expression level (14-17). For example, the AGG codon markedly affects the 
translation rate of genes in E.coli (14,15). This suggests that for a 
heterologous gene to have a maximal level of expression its codon usage must 
correspond to that of the host. By using the RSCU values of potential hosts 
to calculate CAI values for a heterologous gene it should be possible to 
predict how well suited that gene would be to the translational systems of 
those hosts. In Table 4 the CAI values of some genes of biotechnological 
interest are given for two different potential hosts, E.coli and yeast. In 
each case these mammalian genes seem better 'adapted' to E.coli, suggesting 
that high expression might be more easily obtained in that system. Of 
course, in reality, the choice of host would probably depend on other 
practicalities. The CAI would, however, suggest whether it is likely to be 
either necessary or of any benefit to chemically synthesize a new gene, to 
include more appropriate codons. It should be stressed that the CAI is only 
an approximate indication of the suitability of the codon usage within a 
gene. For example, it takes no account of the distribution of codons along 
the gene, yet theoretical considerations suggest that this may be very 
important (18). 
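
The host comparison described above amounts to scoring the same codons against each host's w table. The sketch below is purely illustrative: the four-codon fragment is hypothetical (it is not one of the genes in Table 4), the helper name `cai` is an assumption, and only the w values for those four codons are copied from Table 1.

```python
import math

# w values copied from Table 1 for four codons (subset for illustration only).
W = {
    "E.coli": {"CUG": 1.000, "AAA": 1.000, "GGU": 1.000, "CGU": 1.000},
    "yeast":  {"CUG": 0.003, "AAA": 0.135, "GGU": 1.000, "CGU": 0.137},
}

def cai(codons, w_table):
    """Geometric mean of w over the codons, via the log-sum form."""
    return math.exp(sum(math.log(w_table[c]) for c in codons) / len(codons))

codons = ["CUG", "AAA", "GGU", "CGU"]  # hypothetical fragment, not a real gene
scores = {host: cai(codons, table) for host, table in W.items()}
best_host = max(scores, key=scores.get)
print(scores, best_host)
```

A fragment built from codons that are optimal in E.coli scores 1.0 there and much lower in yeast, which is the kind of contrast Table 4 reports for whole genes. As the text stresses, such a score is only an approximate guide to suitability.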

A measure of evolutionary adaptedness. 

Under certain natural circumstances foreign genes are expressed in host 
organisms. Viral genes are an obvious example. Codon usage in the many 
bacteriophages which do not encode their own tRNA molecules should be 
adapted to the translational machinery of the host. Then the CAI, using host 
RSCU values, is an estimate of the degree of adaptation. For example, 
comparison of the pattern of codon usage in the genes of bacteriophage T7 
with the relative abundance of cognate tRNA molecules in E.coli (considered 
to be the usual host of T7) suggests that T7 genes are not so well adapted 
as E.coli's own genes, although there is clearly some adaptation (19,20). 
This seems to be confirmed by contrasting the distribution of CAI values for 
T7 genes with those of E.coli (Figure 1). However, the difference seen in 
Figure 1 could arise in part because the genes contrasted encode different 
products; for example, T7 encodes no ribosomal proteins. It has been 
reported that four genes in T7 are homologous to three E.coli genes (21). A 
comparison of these genes (Table 5) is not conclusive, because only ssb is 
highly adapted in E.coli, although in that case the T7 gene does have a 
lower CAI. The four T7 genes as a group do not seem to be significantly less 
adapted than the three E.coli genes. 

Table 5. CAI values for homologous genes from E.coli and T7. 

E.coli               T7 
gene      CAI        gene     CAI 

ssb       0.605      2.5      0.573 
dnaG      0.271      4        0.301 
                     5        0.341 
polA      0.391      6        0.387 

In cases where it has not been clear which organism represents the 
major host for a virus it may prove informative to calculate CAI values with 
the different RSCU values of potential hosts. For example, despite 
approximately 65% DNA homology between φX174 and G4, the genomes of these 
two "coliphages" show a remarkable difference with respect to the frequency 
of the recognition sites of enterobacterial restriction enzymes (22). While 
φX174 (as well as several other coliphages) has a significant avoidance of 
these sites, presumably reflecting adaptation to infecting E.coli, G4 does 
not. However, CAI values for the 10 genes of φX174 and G4 are very similar, 
suggesting that the patterns of codon usage of the two phages are adapted 
(to E.coli) to equivalent extents. 

Natural foreign gene expression would also occur if genes undergo 
horizontal transfer. Felmlee et al. (23) have discussed a possible example. 
They reported the DNA sequence of a region of the E.coli chromosome encoding 
four hemolysin genes, and found that their base composition and codon usage 
are atypical of that species. This, together with the observation that these 
genes are found in only a limited number of E.coli strains, was taken as 
evidence that the genes represent a recent acquisition to this species (23). 
The CAI values for these genes are indeed very low, ranging from 0.202 to 
0.243. These values are lower than those for nearly all other E.coli genes 
(see Figure 1, in which the hemolysin genes are not included), including 
some (e.g., araC and dnaG) which are expressed at very low levels. Hemolysin 
is an extracellular protein and would be expected to be expressed at much 
higher levels than araC or dnaG, so these low CAI values suggest that the 
hemolysin genes are not well adapted to E.coli, and seem to confirm the 
suggestion of a recent acquisition. If reference RSCU data were available 
for a variety of organisms from which the genes could have been transferred, 
it might be possible to determine the most likely source by comparison of 
CAI values. 

If plasmids were regularly subject to interspecific transfer, then 
their genes might not become adapted to any one host. Genes on E.coli 
plasmids tend to have less codon bias than chromosomal genes (3). We note 
that the three genes of the yeast 2 micron plasmid have very low CAI values 
(Table 2). 

Synonymous codon usage and the rate of molecular evolution. 

A major prediction of the neutral theory of molecular evolution (24) is 
an inverse relationship between the rate of evolution and the degree of 
selective constraint, i.e., the stronger the constraint the slower the rate 
of molecular evolution. Indeed, a great deal of evidence confirms this, 
including the observation that pseudogenes, which are under no apparent 
constraint, are the fastest evolving DNA sequences (25). That synonymous 
substitutions in protein coding genes occur at a slower rate than 
substitutions in pseudogenes (26,27) implies that there are selective 
constraints on the former. If the differences between genes in degree of 
codon usage bias largely reflect differences in selection pressure on 
synonymous codons, then the rate of synonymous substitution would be 
inversely related to the degree of codon bias. The CAI can be used to 
quantify this relationship. Comparisons of E.coli and Salmonella typhimurium 
genes do indeed show a significant negative correlation between the rate of 
synonymous substitution and the CAI (28). 

Comparison of codon usage in different organisms. 

Meaningful comparisons of codon usage in different organisms can be 
made if care is taken in defining the reference set of genes from which the 
RSCU values are calculated. The reference sets we have chosen for E.coli and 
yeast comprise very similar collections of genes, yet the distributions of 
CAI values for genes from these two organisms are rather different. Very 
highly expressed genes in yeast have on average a more extreme codon bias 
than their counterparts in E.coli, as seen for example with ribosomal 
protein genes (Table 2). The reference set of RSCU values reflects this, and 
so the genes with least codon usage bias in yeast have lower CAI values than 
genes in E.coli as a result. It is particularly interesting to note that 
cluster analysis of yeast genes based on their synonymous codon usage 
clearly differentiates two groups, identified as comprising highly and 
moderately/lowly expressed genes (5), and that those two groups correspond 
almost exactly to the bimodal distribution of CAI values for yeast genes in 
Figure 1. By contrast, cluster analysis does not so easily differentiate 
highly and lowly expressed genes in E.coli or in T7 (5) and the 
distributions of CAI values from those organisms are unimodal (Figure 1). It 
is not clear why selection has apparently been more successful in producing 
high codon bias in yeast than in E.coli. Li (29) has shown that the 
effectiveness of selection in maintaining synonymous codon bias depends 
largely on the strength of selection and effective population size. It could 
be that the strength of selection is stronger in yeast than in E.coli 
because the required amount of certain gene products, such as ribosomal 
proteins, is larger. It is also possible that the effective population size 
is larger in yeast than in E.coli because the latter has a largely clonal 
population structure (30). 

We note that comparisons between species can be difficult when the 
reference sets of genes have quite different levels of bias in codon usage. 
For example, very highly expressed genes have a much lower bias in codon 
usage in Bacillus subtilis than in E.coli or yeast (Shields and Sharp, in 
prep.). Then, in B.subtilis, there are few codons with very low w values. 
As a consequence, CAI values for other genes in B.subtilis are, on average, 
higher than those seen in the other species, even though the B.subtilis 
genes have clearly less bias. The $\mathrm{CAI}_{obs}$ given by equation [4] is 
less affected by this difference in the reference set, and may form a better 
basis for comparison between species under these circumstances. 

Identification of protein-coding reading frames. 

Several of the indices of codon usage bias were originally devised in 
order to ascertain the likelihood that open reading frames are indeed 
protein-coding. As with the other measures, the CAI should be useful in this 
context, particularly in locating genes of moderate to high expression. 
However, some of the points outlined above indicate that difficulties may 
arise in interpreting low CAI values. Thus, while a high CAI is probably a 
good indication that a reading frame is protein-coding, a low CAI may 
indicate a gene of low expression, a gene of heterologous origin (as with 
the hemolysin genes), or a noncoding region that happens to contain no 
termination codons. The CAI value expected for a random sequence can easily 
be calculated, but a relatively high value for a noncoding sequence may 
arise simply because DNA is not a random sequence of nucleotides, or because 
there is a coding sequence on the complementary strand (31). For example, an 
E.coli gene with no UUA, CUA or UCA codons, but otherwise having the typical 
codon composition of a non-highly expressed gene (6), would give rise to an 
in-phase open reading frame on the complementary strand with a CAI of 
approximately 0.28, which is similar to the lower values seen for E.coli 
genes (Figure 1) and somewhat higher than the value (about 0.17) expected 
for a random sequence. 
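
The random-sequence expectation mentioned above can be approximated from Table 1. The sketch below rests on one loudly stated assumption that the text does not spell out: a "random sequence" is taken to use each of the 59 codons with synonymous alternatives equally often, so that its CAI tends to the geometric mean of the 59 E.coli w values. The variable names are illustrative.

```python
import math

# E.coli w values from Table 1, grouped by amino acid (AUG and UGG omitted).
W_ECOLI = {
    "Phe": [0.296, 1.000],
    "Leu": [0.020, 0.020, 0.042, 0.037, 0.007, 1.000],
    "Ile": [0.185, 1.000, 0.003],
    "Val": [1.000, 0.066, 0.495, 0.221],
    "Ser": [1.000, 0.744, 0.077, 0.017, 0.085, 0.410],
    "Pro": [0.070, 0.012, 0.135, 1.000],
    "Thr": [0.965, 1.000, 0.076, 0.099],
    "Ala": [1.000, 0.122, 0.586, 0.424],
    "Tyr": [0.239, 1.000],
    "Cys": [0.500, 1.000],
    "His": [0.291, 1.000],
    "Gln": [0.124, 1.000],
    "Asn": [0.051, 1.000],
    "Lys": [1.000, 0.253],
    "Arg": [1.000, 0.356, 0.004, 0.004, 0.004, 0.002],
    "Asp": [0.434, 1.000],
    "Glu": [1.000, 0.259],
    "Gly": [1.000, 0.724, 0.010, 0.019],
}

# Flatten to the 59 codons that have synonymous alternatives.
weights = [w for family in W_ECOLI.values() for w in family]

# Assumption: every one of the 59 codons is equally frequent.
expected_cai = math.exp(sum(math.log(w) for w in weights) / len(weights))
print(len(weights), round(expected_cai, 2))
```

Under that assumption the result falls close to the value of about 0.17 quoted in the text; a different model of randomness (for example, one matching a given base composition) would give a somewhat different figure.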

Comparison of CAI with other indices. 

The CAI is a very simple measure of the extent of synonymous codon 
usage bias, specifically in the direction of the bias seen in highly 
expressed genes. It has the advantage, compared with indices which measure 
only the frequency of certain optimal codons, of taking account of all 59 
codons where synonymous alternatives exist, each in a quantitative manner. 
For example, both the codon bias index (4) and the frequency of optimal 
codons (1) treat GCU and GCC equally, as preferred codons for Ala in yeast, 
and yet the frequency of GCU is approximately three times that of GCC in 
very highly expressed genes (Table 1). With heterologous gene expression in 
mind it may be of primary importance to know the frequency of particularly 
disadvantageous codons in a gene. Simpler indices compound these very rare 
codons with others not in the 'optimal' category. Thus in E.coli AUA and AUU 
are treated equally (1), despite their very different frequency of use (see 
Table 1, and Ref.6). Again the CAI takes account of these differences 
quantitatively. 

The codon preference statistic (10) is similar but not identical to the 
$\mathrm{CAI}_{obs}$ given by equation [4]. One difference is that in calculating 
the codon preference statistic the p values (analogous to RSCU in equation 
[4]) are adjusted to take account of base composition. Another difference is 
that the CAI value is scaled to allow for the different amino acid 
compositions of different proteins (see equation [3]), and has a range from 
0 to 1.0. Although this scaling cannot completely compensate for differing 
amino acid compositions, it facilitates comparisons between genes. 

Our discussion of the use of the Codon Adaptation Index has focussed on 
unicellular organisms because the determinants of codon usage in 
multicellular organisms are not well understood (1). For example, it appears 
that the mammalian genome comprises regions of quite different G+C content 
(32), and that local G+C content is an important influence on codon usage in 
any one gene (1). Also tRNA abundancies are important selective constraints 
on codon usage, and in multicellular organisms tRNA populations vary among 
tissues. We also note that the only mammalian ribosomal protein genes for 
which DNA sequence data are available (two from mouse and two from rat -- 
see Ref.33) do not seem to show particularly high synonymous codon bias. It 
may be possible in the near future to derive a reference set of RSCU values 
from other highly expressed mammalian genes, and/or it may prove necessary 
to take into account the tissue in which the gene is expressed, for example 
by having several reference sets. 

ABSTRACT 

The frequencies of each of the 257 468 complete protein 
coding sequences (CDSs) have been compiled from 
the taxonomical divisions of the GenBank DNA 
sequence database. The sum of the codons used by 
8792 organisms has also been calculated. The data 
files can be obtained from the anonymous ftp sites of 
DDBJ, Kazusa and EBI. A list of the codon usage of 
genes and the sum of the codons used by each 
organism can be obtained through the web site 
http://www.kazusa.or.jp/codon/. The present study also 
reports recent developments on the WWW site. The new 
web interface provides data in the CodonFrequency-compatible 
format as well as in the traditional table 
format. The use of the database is facilitated by 
keyword based search analysis and the availability of 
codon usage tables for selected genes from each 
species. These new tools will provide users with the 
ability to further analyze for variations in codon 
usage among different genomes. 

DESCRIPTION 

We have been compiling the codon usage of all the full-length 
protein gene entries in the international DNA sequence databases. 
The compiled files are now freely available through the 
internet. The purpose of the database designated CUTG is to 
provide an electronic dataset for codon usage-based analyses. 
CUTG consists of lists of the codon usage of genes and the sum 
of codon use by each organism. As of September 1999, CUTG 
will contain 257 468 genes from 8792 organisms. The database 
has been constructed from the nucleotide sequences obtained 
from the latest major release of the GenBank sequence 
database (1). The strategy used for data collection can be examined 
by following the URLs listed in the following section or by 
studying the supplementary material accompanying this 
publication in NAR Online. 

AVAILABILITY 

The authors recommend that the database be accessed through 
the WWW server at Kazusa DNA Research Institute, which 
provides a user-friendly interface for interactive access: 
http://www.kazusa.or.jp/codon/ 

The database displays codon usage in a format compatible 
with that of CodonFrequency output in the GCG Wisconsin 
Package™. Thus, users who have the GCG package in their 
local environment can do further analyses with the files generated 
by the database. Also, for each species there is a new query box 
to search for information in the comments of each gene. The 
user can choose complete protein coding sequences (CDSs) by 
keyword and then make codon usage tables from the selected 
genes. This tool provides users with the ability to analyze for 
intra-species variation in codon usage. For example, it has 
been reported that protein production levels can be predicted 
from the complete genome sequences of microbes using the 
codon usage biases compiled from ribosomal protein genes (2). 

The complete dataset of CUTG is available through the 
following URLs: 
Kazusa ftp://ftp.kazusa.or.jp/pub/codon/current/ 
DDBJ ftp://ftp.nig.ac.jp/pub/db/codon/current/ 
EBI ftp://ftp.ebi.ac.uk/pub/databases/cutg/ 
In August 1999, the construction and primary distribution 
site of the database was moved to Kazusa DNA Research Institute 
from the DNA Information and Stock Center. Descriptions of 
the files are maintained as README files through the URLs. 

SUPPLEMENTARY MATERIAL 

See Supplementary Material available at NAR Online. 
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QUERY Box for search with Latin name of organism 



Case: ® sensitive ©insensitive 



Submit Clear 



Input a scientific name (or its regular expression) for an organism and press "Submit" or return 
key. Use Latin name such as "Marchantia polymorpha", "Saccharomyces cerevisiae" etc., not 
"liverwort", "yeast" etc. 

Alphabetical lists of all organisms 
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Others (intials are not capital) 
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Additional Service 

Countcodon program: compilation a sequence into a codon usage table 
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G 



GB virus C variant troglodytes [gbvrl]: 1
GR mouse mammary tumor virus [gbvrl]: 1
Gadus morhua [gbvrt]: 23
Gaeumannomyces graminis [gbpln]: 1
Gaeumannomyces graminis var. graminis [gbpln]: 1
Gagea lutea [gbpln]: 1
Galactomyces geotrichum [gbpln]: 4
Galago senegalensis [gbpri]: 1
Galanthus nivalis [gbpln]: 1
Galdieria partita [gbpln]: 2
Galdieria sulphuraria [gbpln]: 2
Galega orientalis [gbpln]: 1
Galeus melastomus [gbvrt]: 1
Galinsoga mosaic carmovirus [gbvrl]: 5
Galleria mellonella [gbinv]: 13
Galleria mellonella densovirus [gbvrl]: 4
Galleria mellonella nuclear polyhedrosis virus [gbvrl]: 2
Gallid herpesvirus 1 (serotype 1) [gbvrl]: 5
Gallid herpesvirus 1 (serotype 2) [gbvrl]: 88
Gallid herpesvirus 2 [gbvrl]: 152
Gallid herpesvirus 3 [gbvrl]: 1
Gallus [gbvrt]: 1
Gallus gallus [gbvrt]: 1782
Gallus sp. [gbvrt]: 4
Gambusia affinis [gbvrt]: 1
Ganoderma applanatum [gbpln]: 1
Ganoderma lucidum [gbpln]: 1
Ganoderma microsporum [gbpln]: 1
Garcinia mangostana [gbpln]: 3
Gardner-Arnstein feline leukemia oncovirus B [gbvrl]: 3
Gardnerella vaginalis [gbbct]: 2
Garlic mite-borne mosaic virus [gbvrl]: 3
Garlic mosaic virus [gbvrl]: 2
Gasterophilus intestinalis [gbinv]: 1
Gastrodia elata [gbpln]: 4
Gecarcinus lateralis [gbinv]: 2
Gekko gecko [gbvrt]: 2
Gelonium multiflorum [gbpln]: 1
Gentiana lutea [gbpln]: 1
Gentiana triflora [gbpln]: 5
Geobacter metallireducens [gbbct]: 1
Geochelone carbonaria [gbvrt]: 1
Geodia cydonium [gbinv]: 35
Geomys attwateri [gbrod]: 1
Geomys bursarius [gbrod]: 2
Geomys bursarius major [gbrod]: 1
Geomys knoxjonesi [gbrod]: 3
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® Codon Usage Table with Amino Acids 
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Keyword example: ribosomal protein / MAP kinase 
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ABSTRACT We have developed a novel method of 
DNA extraction combined with a high-throughput 
method of gene detection allowing thousands of poten- 
tially transgenic chicks to be screened quickly and reli- 
ably. By using this method and a replication-deficient 
retroviral vector based on avian leukosis virus (ALV), 
we have demonstrated germline transmission of three
different transgenes. Several generations of chickens carrying
intact transgenes were produced, validating the use
of the ALV retroviral vectors for large-scale production
of transgenic flocks. Fourth-generation chicks that were
nontransgenic, hemizygous, or homozygous for the
transgene were identified with the combined genetic
screening methods. 

(Key words: chicken, transgenic, replication-deficient retroviral vector, high-throughput screening, DNA extraction) 

2002 Poultry Science 81:202-212 



INTRODUCTION 

An obstacle to avian transgenesis is the low efficiency 
of introducing foreign DNA into the chicken genome. 
Procedures that have worked for other animals are difficult,
if not impossible, due in part to the unique reproductive
physiology of the chicken (Love et al., 1994; Naito et 
al., 1994). New methods, including the use of transposable 
elements, show promise but require additional refine- 
ment before their utility is confirmed (Sherman et al., 
1998). By using retroviral-based vectors, however, several 
research groups have successfully introduced transgenes 
into the chicken genome at low but acceptable efficiencies. 
The first transgenic birds were produced using replication-competent
retroviral vectors based on avian leukosis
virus (ALV) (Salter et al., 1987; Crittenden and Salter,
1989; Petropoulos et al., 1992). Approximately 25% of the
males that hatched from injected embryos (referred to as
generation zero or G0) gave rise to transgenic offspring
(G1) at frequencies ranging from 1 to 11% of total offspring.
Vectors based on reticuloendotheliosis virus
(REV) have also yielded transgenic chickens (Bosselman
et al., 1989; Briskin et al., 1991). These vectors carried the 
transgene flanked by 5' and 3' long terminal repeats (LTR) 
but did not carry the genes required for replication. How- 
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ever, the cell lines used to package the vector gave rise 
to a low level of replication-competent virus, leading to 
some viremia in G0 birds (Hu et al., 1987; Bosselman et 
al., 1989). 

Vectors based on ALV have been shown to yield high
titers without contamination with replication-competent
virus (Savatier et al., 1989; Cosset et al., 1990, 1991, 1992).
The vector shown to produce the highest titers (NL-B;
referred to as NLB in this paper) contains cis-acting sequences
from Rous-associated virus-2, including the LTR
and encapsidation sequence. A small portion of the gag
coding sequence, which contains a splice donor site, is in
frame with the neomycin resistance gene (neo r) and is
expressed as a GAG-NEO r fusion protein from the 5' LTR
transcript. Downstream of neo r is a splice acceptor, lacZ,
and a portion of the env coding sequence. A GAG-LacZ
fusion protein is expressed from the spliced 5' LTR transcript. 

NLB particles can be produced bearing subgroup A,
B, C, or E envelope proteins, but subgroup A was determined
to be the most useful for transduction of early
chicken embryos (Thoraval et al., 1995). NLB is packaged
as transduction particles with subgroup A specificity using
the Isolde cell line. Isolde cells were produced by
stable transformation with two separate vectors derived
from ALV genomes, one expressing the gag-pol genes and
the other expressing env A (Cosset et al., 1990). The gag-pol and 
env expression vectors were engineered to have minimal 



Abbreviation Key: GAPDH = glyceraldehyde-3-phosphate dehydrogenase;
G0, G1, etc. = Generation 0, 1, etc.; LTR = long terminal repeat;
NLB = high-titer vector; PCR = polymerase chain reaction; REV = reticuloendotheliosis
virus; Std = standard. 
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overlap with each other and with NLB to minimize possi- 
ble recombination and subsequent reformation of intact 
retroviral genomes. The RNA encapsidation signal, psi, 
was also deleted from the gag-pol and env expression 
vectors to prevent packaging of aberrant recombinant 
transcripts, if they occurred, into transduction particles. 

To demonstrate the production of transgenic chickens
with the NLB vector, Thoraval et al. (1995) transduced
unincubated Brown Leghorn embryos with NLB pseudotyped
with envA. Of 16 males that hatched, one G0
male harbored the transgene in his sperm DNA. Of 220
chicks bred from this male, four carried the intact
transgene. The transgene in G1 birds was passed on to
the next generation (G2), and the genes encoded by NLB,
neo r and lacZ, were both expressed in G2 embryos. 

We sought to further validate the ALV retroviral vector 
system for the production of transgenic chickens. We also 
wanted to develop procedures that would identify 
transgenic offspring rapidly with less labor-intensive 
methods. We describe in this paper the production of 
three flocks that harbored stably integrated transgenes 
that were transmitted through several generations. We 
describe methods for high-throughput DNA extraction 
and transgene detection to facilitate the identification of 
transgenic founders and their progeny. 

MATERIALS AND METHODS 

High-Throughput Extraction 
of DNA from Avian Blood 

Lysis buffer 1 (0.1 to 0.3 mL; 0.32 M sucrose, 10 mM 
Tris-Cl, 5 mM MgCl2, and 1% Triton X-100, pH 7.5) was 
added to the wells of a 96-well, flat-bottom plate (Falcon 
cat. no. 353072 4 ). Chicks were usually bled 3 to 10 d post- 
hatch. The leg vein was pricked with a pointed scalpel 
blade, and blood was collected immediately with a hepa- 
rinized capillary tube (cat. no. 22-362-566 5 ). For chicks 
older than 10 d, a wing vein was pricked. The blood was 
transferred into one well containing lysis buffer 1 and 
mixed using the capillary. Nuclei were collected by cen- 
trifugation at 960 x g for 7 min in a tabletop centrifuge 
equipped with swinging bucket carriers for multiwell 
plates; the supernatant was carefully aspirated using a 
micropipet tip. The same tip could be used repeatedly 
without detectable cross-contamination. Next, 0.05 mL of 
lysis buffer 2 [10 mM Tris-Cl, 10 mM NaCl, 10 mM EDTA, 
and 1 mg/mL proteinase K (cat. no. V3021 6 ), pH 8.0] was 
added to each well and the covered plate was incubated 
for 2 h at 56 to 65 C. The plates were cooled to room 
temperature and 1.5 µL of 5 M NaCl and 0.1 mL ethanol 
were added to each well. The plate was allowed to stand 
overnight at 4 C. Supernatants were removed by inverting 



4 Becton Dickinson Labware, Franklin Lakes, NJ 07417. 

5 Fisher, Norcross, GA 30091. 

6 Promega, Madison, WI 53711. 

7 Sigma Chemical Co., St. Louis, MO 63178-9916. 



the plate and pouring into a large beaker. Precipitates 
were washed with 0.2 to 0.3 mL of 70% ethanol with 
supernatants removed as before. DNA were air dried to 
transparency at 65 C for 1 h, and 0.2 mL polymerase chain 
reaction (PCR)- or DNA-grade H2O was added to each 
well. A sheet of Parafilm was placed over the wells and 
the lid was placed tightly on top of the Parafilm. DNA 
were incubated overnight at 4 C, followed by gently shak- 
ing at the lowest speed on a vortexer with a microplate 
holder at room temperature for 2 to 24 h. 

Phenol-Based Extraction of Avian Blood 

This protocol was adapted from Current Protocols in 
Molecular Biology (Ausubel et al., 1999). Blood was col- 
lected from a wing vein of birds 10 d or older with a 25- 
ga needle and a 1-cc syringe primed with 0.05 mL of 
heparin (cat. no. 210-6 7 ); 0.3 mL was transferred to an 
eppendorf tube. The sample was centrifuged for 5 min 
at 500 x g, the supernatant removed, and the pellet 
washed twice with ice-cold PBS. Then 0.6 mL of 100 mM 
NaCl; 10 mM Tris-Cl, pH 8.0; 25 mM EDTA; 0.5% SDS; 
and 0.1 mg proteinase K/mL were added, and the sample 
was gently mixed overnight at 50 C. The tube was centri- 
fuged at 12,000 x g for 10 min, and 0.6 mL of the superna- 
tant was transferred to a fresh tube and extracted 5 times 
with an equal volume of phenol:chloroform:isoamyl alco- 
hol (25:24:1). Each time, the sample was gently mixed for 
5 min and centrifuged at 12,000 x g for 10 min. After 
transferring the aqueous phase to a 15-mL conical tube 
containing 0.3 mL of 7.5 M ammonium acetate and 1.2 
mL 95% ethanol, the sample was mixed until a precipitate 
formed and then centrifuged at 1,700 x g for 5 min. The 
pellet was washed three times each with 2 mL of 70% 
ethanol, air-dried, and resuspended in 0.6 mL water. 

Vector Construction 

To efficiently replace the lacZ gene of pNLB with any
transgene, an intermediate vector, pNLB-Adapter, was
first created. It was constructed by inserting the ApaI
fragment of pNLB, with 3' overhangs removed (containing
lacZ, gag and the 3' LTR) (Cosset et al., 1991),
into the KpnI/SacI sites, with 3' overhangs removed, of
pBluescriptKS(-). To create pNLB-Adapter-CMV-BL,
lacZ was replaced with the cytomegalovirus (CMV) promoter
and the β-lactamase (bla or BL) gene (in pNLB,
KpnI is present 67 bp upstream of lacZ and NdeI is present
100 bp upstream of the lacZ stop codon). A pCMV-BL
(Moore et al., 1997) MluI/XbaI fragment, with 3' recessed
ends filled-in, was then inserted into the KpnI (3' overhang
removed)/NdeI (3' recessed end filled-in) sites of pNLB-Adapter.
To create pNLB-CMV-BL, the HindIII/BlpI insert
of pNLB containing lacZ was replaced with the HindIII/BlpI
insert of pNLB-Adapter-CMV-BL. This two-step
cloning was necessary for some vectors because direct
ligation of blunt-end fragments into the HindIII/BlpI sites
of pNLB yielded mostly rearranged subclones. pNLB-OV-CAT
was constructed in a similar manner, except that 
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a 1.4-kb fragment of the chicken ovalbumin promoter 
linked to the chloramphenicol acetyl-transferase (CAT) 
coding sequence was inserted into the KpnI/NdeI sites of 
pNLB-Adapter, with ends polished as described above. 

Production of Transduction Particles 

Senta and Isolde cells were cultured in Ham's F10 medium
(cat. no. 81200-040 8 ), 5% newborn calf serum (cat.
no. 16010-159 8 ), 1% chicken serum (cat. no. 16110-082 8 ),
50 µg/mL phleomycin (cat. no. PHLEP0250 9 ), and 50 µg/mL
hygromycin (cat. no. H-3274 7 ). Transduction particles
were produced as described (Cosset et al., 1993) with the
following exceptions. Two days after transfection of the
retroviral vector into 9 x 10^5 Senta cells, particles were
harvested in fresh media for 6 to 16 h and then filtered.
All of the media were used to transduce 3 x 10^6 Isolde
cells in three 100-mm plates containing 4 µg polybrene/mL.
The following day, fresh media containing 50 µg
phleomycin/mL, 50 µg hygromycin/mL, and 200 µg G418
(cat. no. G-9516 7 )/mL were added. After 10 to 12 d, single
G418 r colonies were isolated and transferred to 24-well
plates. After 7 to 10 d, titers from each colony were determined
by transduction of Senta cells and G418 selection.
Typically, 1 of 30 colonies gave titers of 1 to 3 x 10^5. Those
colonies were expanded and particles concentrated to 2
to 7 x 10^6 as described (Allioli et al., 1994). Packaging
of intact transgenes was confirmed by assaying for the
reporter proteins, β-galactosidase or β-lactamase, in
Senta cells transduced with NLB or NLB-CMV-BL. DNA
was extracted from Senta cells transduced with NLB-OV-CAT,
and PCR confirmed the integrity of the OV-CAT
cassette. 

Production of Transgenic Chickens 

Concentrated particles (7 to 10 µL) were injected into 
the subgerminal cavity of freshly laid fertilized eggs as 
described (Thoraval et al., 1995), except that the eggs 
were windowed and resealed by the method described 
in Speksnijder and Ivarie (2000). Fertile White Leghorn 
eggs were used for NLB transductions. Specific pathogen- 
free White Leghorn eggs from SPAFAS 10 were used for 
NLB-CMV-BL and NLB-OV-CAT. 

Sperm DNA Extraction 

Sperm DNA was extracted by a method used on foren- 
sic material (Walsh et al., 1991). Semen was collected into 
a Nalgene cryogenic vial (cat. no. 66008-728 11 ) and 3 to 



8 Gibco-BRL, Gaithersburg, MD 20897. 

9 Cayla Laboratories, Toulouse cedex 4, France. 

10 Preston, CT 06365. 

11 VWR, Suwanee, GA 30174. 

12 Molecular Probes Inc., Eugene, OR 97402. 

13 Turner Designs, Sunnyvale, CA 94086. 

14 PE Applied Biosystems, Foster City, CA 94404. 

15 Molecular Devices Corp., Sunnyvale, CA 94089. 



10 µL was mixed with 0.2 mL of 5% Chelex-100 (cat. no. 
C7901 7 ), 0.1 mg proteinase K/mL, and 35 mM DTT (cat. 
no. D9779 7 ) and was incubated at 56 C for 30 to 60 min. 
Samples were vortexed for 5 to 10 s, briefly centrifuged, 
and then incubated at 100 C for 8 min. Samples were 
again vortexed for 5 to 10 s, centrifuged at 12,000 x g for 
2 to 3 min, and stored at 4 C until use. DNA concentration 
was measured using the Picogreen® dsDNA Quantitation 
Kit (cat. no. P-7589 12 ) and a fluorometer (model TD-700 13 ). 

Taqman Reactions 

For detection of the chicken glyceraldehyde-3-phosphate
dehydrogenase (GAPDH) gene, PCR primers were
chGAPDH-1 (5'-TCCCAGATTTGGCCGTATTG-3') and
chGAPDH-2 (5'-CCACTTGGACTTTGCCAGAGA-3').
The Taqman probe sequence (chGAPDH probe) was
5'-CCGCCTGGTCACCAGGGCTG-3' and was labeled with
FAM (6-carboxyfluorescein) at the 5' end and TAMRA
(N,N,N',N'-tetramethyl-6-carboxyrhodamine) at the
3' end. For detection of the neo r gene, primers used in the
Taqman reaction were Neofor-1 (5'-TGGATTGCACGCAGGTTCT-3')
and Neorev-1 (5'-GTGCCCAGTCATAGCCGAAT-3').
The Taqman probe sequence (Neoprobe)
was 5'-CCTCTCCACCCAAGCGGCCG-3' and was labeled
with TET (tetrachloro-6-carboxyfluorescein) or
FAM (6-carboxyfluorescein) at the 5' end and TAMRA
(N,N,N',N'-tetramethyl-6-carboxyrhodamine) at the 3'
end. Primers were synthesized by Gibco-BRL 8 and probes
by PE Applied Biosystems. 14 Reactions were carried out
in 20 µL for blood DNA or 50 µL for sperm DNA in 0.75x
PCR Buffer (cat. no. N808-0010 14 ), 0.25x Taqman buffer
(cat. no. N808-0228 14 ), 2.5 mM MgCl2, 5% DMSO, 125 µM
dATP, 125 µM dCTP, 125 µM dGTP, 250 µM dUTP (cat.
no. N808-0228 14 ), 0.9 µM forward primer, 0.9 µM reverse
primer, 40 nM Taqman probe, 0.05 units AmpliTaq Gold
DNA Polymerase/µL (cat. no. N808-0228 14 ), and 0.004
units AmpErase UNG/µL (cat. no. N808-0228 14 ). Blood
DNA (100 to 300 ng) was then added to each reaction
and analyzed on a Sequence Detector Model 7700 14 under
the following conditions: 50 C for 2 min, 95 C for 10 min,
followed by 40 or 50 cycles of 95 C for 15 s and 60 C for
1 min. 
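
As an illustrative check on oligonucleotides like those listed above, the sketch below computes length, GC content, and reverse complement for the chGAPDH-1 primer and the chGAPDH probe. The helper names are hypothetical, and this kind of check is not described in the paper; it is only one simple way to sanity-check a transcribed sequence.

```python
# Translation table for complementing DNA bases
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in an oligonucleotide."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Sequences as printed in the text (both written 5' to 3')
oligos = {
    "chGAPDH-1": "TCCCAGATTTGGCCGTATTG",
    "chGAPDH probe": "CCGCCTGGTCACCAGGGCTG",
}

for name, seq in oligos.items():
    print(f"{name}: {len(seq)} nt, {gc_percent(seq):.1f}% GC, "
          f"reverse complement {reverse_complement(seq)}")
```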

Taqman Copy Number Assay 

Reactions were carried out in 50 µL using primers and a
probe labeled with FAM specific for neo r described above.
Standards (Std) were performed in triplicate and contained
2.6, 5.2, 10.4, 20.8, and 41.6 ng of DNA extracted
by the high-throughput method from NLB-CMV-BL G1
4133 for Std1, Std2, Std3, Std4, and Std5, respectively. The
number of genomic equivalents per mass of DNA was
calculated based on 2.6 pg of DNA per diploid genome
(Davidson, 1965). Genomic DNA from unknown birds
was extracted by the high-throughput method and quantified
by the Picogreen® assay on a SpectraMax Gemini
fluorometer. 15 Unknown DNA were diluted to 5 ng/µL,
and 10 ng was assayed in duplicate. 
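
The conversion from DNA mass to genomic equivalents can be sketched as follows, using the 2.6 pg per diploid genome figure given in the text. The function name is hypothetical. Treating each diploid genome of the standard bird as carrying one transgene copy is an assumption (it holds only if the bird is hemizygous for a single insertion).

```python
PG_PER_DIPLOID_GENOME = 2.6  # chicken value cited in the text (Davidson, 1965)


def genome_equivalents(mass_ng: float) -> float:
    """Number of diploid genome equivalents in a given mass of DNA."""
    return mass_ng * 1000.0 / PG_PER_DIPLOID_GENOME  # ng -> pg, then per genome


# Standard masses from the text, in nanograms
standards_ng = {"Std1": 2.6, "Std2": 5.2, "Std3": 10.4, "Std4": 20.8, "Std5": 41.6}

for name, ng in standards_ng.items():
    print(f"{name}: {ng} ng is about {round(genome_equivalents(ng))} diploid genomes")

# An unknown sample assayed at 10 ng, as in the text
print(f"10 ng unknown is about {round(genome_equivalents(10.0))} diploid genomes")
```

Under these assumptions the five standards form a twofold dilution series, and a 10-ng unknown falls between Std2 and Std3.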
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Southern Blot Analysis 


Ten micrograms of genomic DNA was digested with
PstI or HindIII according to the manufacturer's instructions, 16
electrophoresed on a 0.8% agarose gel for 16 to
20 h at 1.5 V/cm, and transferred to Genescreen Plus
membranes (cat. no. NEF 486 17 ) according to the manufacturer's
instructions [see also (Southern, 1975)]. A gel-purified
0.9-kb fragment from the neo r gene was labeled with
the Multiprime DNA labeling system (cat. no. RPN1601 18 )
and 3,000 Ci/mmol [α-32P]dCTP (cat. no. PB-10205-250 18 ).
Membranes were probed and washed according to the
manufacturer's instructions in the Genescreen Plus manual.
Membranes were visualized with Biomax MS-1 film
(cat. no. 8294985 19 ) and a Biomax transcreen-HE intensifying
screen (cat. no. 108517B 19 ) for 4 to 24 h. 

RESULTS 

High-Throughput Extraction 
of DNA from Chicken Blood 

Methods to extract DNA from a large number of murine
embryonic stem cell colonies (Ramirez-Solis et al., 1992;
Udy and Evans, 1994) were modified for the extraction
of DNA from avian blood. Blood cells from 3-to-10-d-old chicks were
lysed in each well of a 96-well plate. The DNA was precipitated,
washed, and resuspended as described in Materials
and Methods. Typically, 1 µL of the resuspended DNA
would contain 20 to 300 ng of genomic DNA. The average
for a typical experiment was 127.1 ng/µL ± 46.2 ng/µL
(n = 40). Therefore, 1 to 2 µL could be used reliably for
detection of the transgene in a PCR-based assay. When
compared to a method employing standard phenol extraction,
the qualities of the DNA were similar (Figure
1A). The DNA could be stored for extended periods at 4
C in the plates and used in PCR-based detection assays
and for Southern analysis. 

Genetic Screening with Real-Time PCR 

We chose the ABI PRISM® 7700 Sequence Detection 
System (7700 SDS) and Taqman assay to screen for 
transgenes in a large number of samples because it met 
several criteria: high-throughput detection, easy set-up, 
no radioactivity, low cost per sample, 96-well format, 
high sensitivity and speed (Heid et al., 1996). The system 
utilizes the 5'→3' exonuclease activity of Thermus aquaticus
DNA polymerase (Holland et al., 1991). During a 
Taqman reaction a fluorogenic probe, comprising an oli- 
gonucleotide with reporter and quencher fluorescent dyes 
attached, anneals specifically between forward and re- 
verse primers designed to amplify the sequence of inter- 
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FIGURE 1. Utility of a high-throughput method for extraction of 
chick blood DNA. (A) DNA were extracted from White Leghorn blood
cells using a phenol-based method (lanes marked P) and the high-throughput
method (lanes marked H), and 1, 2, and 5 µg of each were
separated on a 0.8% agarose gel. Sizes are indicated to the left in kilobases.
(B) Taqman reactions with primers and a FAM/TAMRA-modified
oligonucleotide probe complementary to the chicken glyceraldehyde-3-phosphate
dehydrogenase (GAPDH) gene were used to confirm
the reliability of the high-throughput DNA extraction method. ΔRn is
the increase of relative fluorescence due to amplification of the GAPDH
sequence. Three overlapping curves that show no increase in ΔRn are
blank reactions. Curves showing increases in ΔRn are reactions to which
we added 2 µL of each of 21 White Leghorn DNA samples extracted by the
high-throughput method. 



16 New England Biolabs, Beverly, MA 01915. 
17 New England Nuclear, Boston, MA 02118. 
18 Amersham Pharmacia Biotech, Inc., Piscataway, NJ 08855. 
19 Kodak, Rochester, NY 14650. 

FIGURE 2. NLB vector constructs NLB (A), NLB-CMV-BL (B), and NLB-OV-CAT (C). Sequences marked are 5' and 3' long terminal repeats 
(LTR); genes for neo r, lacZ, β-lactamase (bla), and chloramphenicol acetyl-transferase (CAT); and remnants of gag and env. The estimated positions 
of restriction sites relative to beginning of the 5' LTR are denoted in basepairs. The length from 5' to 3' LTR is noted at the end of each vector. 
Because NLB has not been sequenced in its entirety, measurements in basepairs are estimated from published data (Cosset et al., 1991; Thoraval 
et al., 1995) and our unpublished data. 



est. During amplification, the probe anneals to the amplified
sequence and is cleaved by the 5'→3' exonuclease 
activity of Taq DNA polymerase, releasing the reporter 
dye from the quencher dye. With each cycle, additional 
reporter dye molecules are cleaved from the probe, and 
the increase in fluorescence intensity is measured at each 
cycle. At the end of a run, accumulated fluorescence inten- 
sity is plotted against cycle number. An approximately 
twofold increase in the copy number of a target sequence 
will result in the amplification initiating one cycle earlier. 
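
The relationship between copy number and the cycle at which amplification initiates can be sketched as below. The code assumes an idealized reaction in which the target doubles every cycle (efficiency of 1.0); the function names and the efficiency parameter are illustrative assumptions rather than part of the instrument's software.

```python
import math


def cycles_earlier(fold_increase: float, efficiency: float = 1.0) -> float:
    """Cycles by which amplification initiates earlier for a given fold
    increase in starting copies, assuming each cycle multiplies the target
    by (1 + efficiency)."""
    return math.log(fold_increase) / math.log(1.0 + efficiency)


def fold_difference(cycle_shift: float, efficiency: float = 1.0) -> float:
    """Fold difference in starting copies implied by a shift in cycle number."""
    return (1.0 + efficiency) ** cycle_shift


for fold in (2.0, 4.0, 10.0, 100.0):
    print(f"{fold:>6.0f}-fold more target: about "
          f"{cycles_earlier(fold):.2f} cycles earlier")
```

With real reactions the efficiency is usually somewhat below 1.0, so the shift per twofold change is slightly more than one cycle.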



To demonstrate the feasibility of screening avian genomic
DNA extracted by the high-throughput method using
the 7700 SDS, a primer/probe set complementary to the
chicken GAPDH gene was designed. DNA samples (2
µL) from nontransgenic White Leghorn chicks selected
at random were added to each Taqman reaction. The
average amount of DNA in each reaction was 362.5 ng ±
116 ng (n = 21). Twenty-one DNA samples gave rise to
similar amplification plots, whereas three reactions with
no DNA displayed no amplification (Figure 1B). Amplifi- 



TABLE 1. Summary of transgenesis with NLB vectors 1

Transgene                                        NLB           NLB-CMV-BL    NLB-OV-CAT

Production of G0 founder flock
  Number of injections                           431           546           497
  Number of birds hatched (%)                    153 (35.5%)   126 (23.1%)   116 (23.3%)
  Number of chicks with transgene
    in their blood DNA (%)                       36 (23.5%)    36 (28.6%)    ND
  Number of males                                65            56            56
  Number of males with transgene
    in their sperm DNA (%)                       4 (6.2%)      3 (5.4%)      3 (5.4%)
  Number of males that transmitted
    transgene to progeny (%)                     1 (1.5%)      1 (1.8%)      1 (1.8%)
Production of G1 flock
  Number of chicks bred from G0 males            821           1,026         849
  Number of G1 transgenics                       1             3             3
  Rate of germline transmission                  0.12%         0.29%         0.35%
Production of G2 flock
  Number of chicks bred from G1 transgenics      26            120           60
  Number of G2 transgenics                       12            61            31
  Rate of germline transmission                  46.2%         50.8%         51.7%

1 G0 = Generation 0; G1 = Generation 1; G2 = Generation 2; ND = not determined. 
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cation curves of the samples did not overlap exactly but 
varied by approximately 0.3 to 1.0 cycles, depending on 
the threshold ΔRn (data not shown). This variability was 
to be expected if a fixed volume rather than fixed mass 
of DNA was added to each reaction. Thus, even though 
DNA concentration was variable in the assay, it could 
be used qualitatively to detect the presence of a specific 
sequence in the avian genome. 

Production of Founder Males 

The original version of NLB and two modified forms 
were used to produce transgenic founder flocks harboring 
three different transgenes, as shown in Figure 2. The first 
vector, NLB, contains the lacZ reporter and neomycin 
resistance (neo r) genes (Figure 2A) (Thoraval et al., 1995). 
neo r is expressed from the LTR promoter and lacZ as a 
subgenomic RNA transcribed from the same promoter. 
The second vector, NLB-CMV-BL, was constructed by 
replacing the 3.0-kb lacZ gene with the CMV promoter 
fused to the β-lactamase gene (bla) (Figure 2B). The third 
vector, NLB-OV-CAT, was constructed by replacing lacZ 
with a 1.4-kb segment of the chicken ovalbumin promoter 
and the chloramphenicol acetyl-transferase (CAT) coding 
sequence (Figure 2C). 

For NLB transduction, we used freshly laid fertilized 
White Leghorn eggs. Seven to ten microliters of concen- 
trated particles were injected into the subgerminal cavity 
of windowed eggs, and chicks hatched after sealing the 
window. We also used SPAFAS White Leghorn eggs for 
NLB-CMV-BL and NLB-OV-CAT transductions. The 
number of eggs injected ranged from 431 to 546. For NLB 
and NLB-CMV-BL chicks, blood DNA was extracted and 
analyzed for the presence of the transgene using a probe- 
primer set designed to detect the neo r gene via the Taqman 
assay. As shown in Table 1, approximately 25% of all 
chicks had detectable levels of transgene in their blood 
DNA, but the levels were just above background (data 
not shown), suggesting that the percentage of blood cells 
with transgene was low (<1%). 

Germline Transmission of the Transgene 

Taqman detection of the neo r gene in sperm DNA was
used to identify candidate G0 males for breeding. As
shown in Figure 3, three G0 males each were identified
that harbored the NLB-OV-CAT (Figure 3A) or NLB-CMV-BL
(Figure 3B) transgenes in their sperm DNA at
levels that were above background. All G0 males positive
for the transgene in their sperm were bred to nontransgenic
hens to identify fully transgenic G1 offspring.
For NLB, 821 chicks were screened before a single
transgenic G1 chick (No. 3092) was identified (Table 1). 
Because all of the blood cells contained a copy of the 
transgene, the amplification curve in the Taqman reaction 
with DNA from 3092 initiated at an early cycle and was 
easily distinguishable from nontransgenics (Figure 4). 
This procedure also allowed us to pool samples because, 
by using DNA from Chick 3092, we determined that DNA 




FIGURE 3. Taqman analysis of sperm DNA. DNA was extracted
from the sperm of Generation 0 (G0) males and screened for the presence
of NLB-OV-CAT (A) or NLB-CMV-BL (B). Taqman reactions with primers
and a TET/TAMRA-modified probe complementary to neo r were
used to detect the presence of the transgene in 100 ng of sperm DNA.
In A, curves with diamonds correspond to Male 1638, pluses to Male
1672, empty boxes to Male 1676, and filled boxes to nontransgene-bearing
males. In B, curves with diamonds correspond to Male 2421,
pluses to Male 2428, empty boxes to Male 2395, and filled boxes to
nontransgene-bearing males. 



pooled from six birds could be reliably assayed in a single 
Taqman reaction, which significantly decreased reagent 
costs and labor and allowed up to 564 samples to be 
screened in one 96-well plate with positive and negative 
control reactions (data not shown). 
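
The plate capacity quoted above can be reproduced with a short calculation. The sketch assumes that two wells per 96-well plate are reserved for the positive and negative controls, which is inferred from 564 = 94 x 6 rather than stated in the text; the function names are hypothetical.

```python
import math


def samples_per_plate(wells: int = 96, control_wells: int = 2,
                      pool_size: int = 6) -> int:
    """Birds screened per plate when DNA from pool_size birds shares one well."""
    return (wells - control_wells) * pool_size


def plates_needed(n_samples: int, **plate_options) -> int:
    """Plates required to screen n_samples birds in pooled reactions."""
    return math.ceil(n_samples / samples_per_plate(**plate_options))


print(samples_per_plate())               # pooled capacity of one plate
print(samples_per_plate(pool_size=1))    # capacity without pooling
print(plates_needed(821))                # plates for the 821 NLB chicks
```

A positive pooled well would still need to be resolved by retesting its six members individually, so the saving is greatest when transgenic chicks are rare.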

For NLB-CMV-BL and NLB-OV-CAT, 1,026 and 849 
chicks were bred, respectively, and three G1 chicks were 
obtained for each transgene (Table 1). For each transgene, 
all G1 progeny came from the male with the highest 
level of transgene in his sperm DNA, even though an 
equivalent number of chicks were bred from each male. 
NLB-OV-CAT Roosters 1672 and 1676 had similar levels 
of transgene in their sperm DNA (Figure 3A), yet only 
1676 gave rise to transgenic offspring. Attempts to quan- 
tify the copy number of the transgene in positive sperm 
samples suggested that the rate of germline transmission 
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FIGURE 4. High throughput screening for NLB transgenic offspring. 
DNA from 82 chicks bred from a G0 male chimera were screened for 
the presence of NLB. Curves initiating after cycle 35 arose from non- 
transgenic chicks. The pair of curves initiating at cycle 18 came from 
DNA extracted in duplicate from the same transgenic chick (male 3092, 
also known as "ALVin"). 



would be as high as 10% of progeny (data not shown). 
This result was much higher than the 1% obtained for 
the males that gave rise to transgenics and the 0.3% for 
all males that were positive for the transgene in their 
sperm DNA. Why this variation occurred is unknown. 
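
The observed transmission rates can be recomputed from the counts in Table 1, as in the sketch below. The per-male figure assumes that chicks were bred in equal numbers from each of the three sperm-positive males, following the statement that an equivalent number of chicks were bred from each male; the variable and function names are illustrative.

```python
# (G1 transgenics, chicks bred from G0 males), taken from Table 1
flocks = {
    "NLB": (1, 821),
    "NLB-CMV-BL": (3, 1026),
    "NLB-OV-CAT": (3, 849),
}


def percent(n: float, total: float) -> float:
    """n as a percentage of total."""
    return 100.0 * n / total


for name, (g1_transgenics, chicks) in flocks.items():
    print(f"{name}: {percent(g1_transgenics, chicks):.2f}% of all chicks bred")

# Rate for the single transmitting male, assuming (hypothetically) that one
# third of the chicks came from each of three sperm-positive males.
for name in ("NLB-CMV-BL", "NLB-OV-CAT"):
    g1_transgenics, chicks = flocks[name]
    print(f"{name}: about {percent(g1_transgenics, chicks / 3):.1f}% "
          f"for the transmitting male")
```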

Southern Analysis of G1 and G2 Birds 

To confirm integration and integrity of the inserted
vector sequences, Southern blot analysis was performed
on DNA from G1 and G2 transgenics. Blood DNA was
digested with HindIII and hybridized to a neo r probe to
detect junction fragments created by the internal HindIII
site found in all three vectors (Figure 2) and genomic
sites flanking the site of integration. The results for birds
carrying NLB and NLB-CMV-BL are shown in Figure 5.
Each of the three G1 birds carrying NLB-CMV-BL had
a junction fragment of unique size, indicating that the
transgene had integrated into three different genomic
sites (Figure 5, Lanes 3 to 5). G1 roosters were bred to
nontransgenic hens to obtain hemizygous G2. As shown
in Table 1, 50.8% of offspring from G1 roosters harboring
NLB-CMV-BL were transgenic, as expected for Mendelian
segregation of a single integrated transgene. Southern
analysis of HindIII-digested DNA from G2 offspring detected
junction fragments similar in size to those originating
from their transgenic parents (Figure 5, Lanes 7 to 9),
indicating that the transgene was transmitted intact. A
probe complementary to the bla gene also detected junction
fragments of varying sizes in HindIII digests (data
not shown). 
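
One way to check whether the G2 counts in Table 1 fit the 1:1 ratio expected for a hemizygous parent is a chi-square goodness-of-fit test, sketched below. This test is not reported in the paper; it is added here only as an illustration, with the critical value 3.841 (1 degree of freedom, 5% level) taken from standard tables.

```python
CRITICAL_5_PERCENT_DF1 = 3.841  # chi-square critical value, 1 df, alpha = 0.05


def chi_square_1to1(transgenic: int, total: int) -> float:
    """Chi-square statistic for observed counts against a 1:1 expectation."""
    expected = total / 2.0
    nontransgenic = total - transgenic
    return ((transgenic - expected) ** 2
            + (nontransgenic - expected) ** 2) / expected


# (G2 transgenics, chicks bred from G1 transgenics), taken from Table 1
g2_counts = {"NLB": (12, 26), "NLB-CMV-BL": (61, 120), "NLB-OV-CAT": (31, 60)}

for name, (transgenic, total) in g2_counts.items():
    stat = chi_square_1to1(transgenic, total)
    verdict = "consistent with" if stat < CRITICAL_5_PERCENT_DF1 else "deviates from"
    print(f"{name}: chi-square = {stat:.3f}, {verdict} 1:1 segregation")
```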

Breeding NLB Transgenic 3092 to a nontransgenic hen 
yielded G2 offspring of which 46.2% were transgenic (Ta- 
ble 1). DNA from a representative 3092-derived G2 
transgenic bird was also assayed by Southern blot, and 
the result is shown in Figure 5 (Lane 10). 

Similar analysis was done for tracking the NLB-OV-CAT
transgene, as shown in Figure 6. Junction fragments 
of two sizes were evident in the three NLB-OV-CAT G1 




FIGURE 5. Southern blot analysis of DNA from Generation 1 (G1;
left panel) and G2 (middle panel) transgenic chickens harboring NLB-CMV-BL.
DNA were digested with HindIII, and the blot was probed
with a fragment from the neo r gene. Band numbers are above each lane.
DNA from a nontransgenic offspring (NTO) is shown in Lane 2; parent-offspring
relationships are indicated with arrows. Bird 5211 (right panel)
harbors the NLB transgene and was bred from 3092 ("ALVin"). Sizes
are indicated to the left in kilobases. MW = molecular weight. 



birds (3535, 4706, and 4576), both of which were shared
by 4706 (Figure 6, Lanes 3 to 5). The similar size of the
bands suggests that 4706 harbored both integration sites
carried by 3535 and 4576, which was possible as all three
G1 birds were bred from the same G0 male. This size
relationship was corroborated by a probe complementary
to the CAT gene that also detected HindIII junction fragments
that were unique in 3535 and 4576 and both of
which were shared by 4706 (data not shown). This shared
junction most likely arose from a primordial germ cell
containing two sites of integration which, when the male was sexually
mature, gave rise to all three NLB-OV-CAT G1 transgenic
progeny. The transgenes segregated into different sperm
during spermatogenesis in the case of 3535 and 4576 but
cosegregated in the sperm that gave rise to 4706. 

The NLB-OV-CAT Gl birds were mated to non- 
transgenics, and 51.7% of the offspring carried the 
transgene (Table 1). G2 birds 8591 and 8255, bred from 
3535 and 4576, respectively, harbored intact junction frag- 
ments (Figure 6, Lanes 7 to 8). Of five G2 offspring bred 
from 4706 analyzed by Southern blotting, four carried 
either one or the other copy of the transgene, whereas 
one offspring carried both (Figure 6, Lanes 9 to 13), again 
demonstrating that the integration sites were unlinked. 
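
The segregation pattern expected for two unlinked integration sites can be enumerated directly. The following is a minimal sketch, assuming independent assortment of the two sites and a mate that carries no transgene; the function name and the site labels A and B are illustrative and do not come from the study.

```python
from fractions import Fraction
from itertools import product

def offspring_classes():
    """Enumerate gametes of a parent hemizygous at two unlinked sites (A and B).

    Assumes independent assortment, so each of the four gamete types
    (neither, A only, B only, both) is equally likely.
    """
    classes = {"none": Fraction(0), "one site": Fraction(0), "both sites": Fraction(0)}
    for has_a, has_b in product([False, True], repeat=2):
        n_sites = has_a + has_b  # booleans add up to 0, 1, or 2
        key = "none" if n_sites == 0 else ("one site" if n_sites == 1 else "both sites")
        classes[key] += Fraction(1, 4)
    return classes

classes = offspring_classes()
transgenic = 1 - classes["none"]
# Share of transgenic offspring expected to carry one site versus both
share_one = classes["one site"] / transgenic
share_both = classes["both sites"] / transgenic
print(classes, share_one, share_both)
```

Under these assumptions, two-thirds of transgenic offspring would carry a single site and one-third would carry both. The split of four and one among the five G2 birds examined is compatible with that expectation, although five birds are too few for a formal test.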
FIGURE 6. Southern blot analysis of DNA from Generation 1 (G1; 
left panel) and G2 (right panel) transgenic chickens harboring NLB- 
OV-CAT. Digests and the probe are the same as in Figure 5. Band 
numbers are above each lane. DNA from nontransgenic offspring (NTO) 
is shown in Lane 2. Parent-offspring relationships are indicated with 
arrows. MW = molecular weight. 



Screening for G3 Progeny 
Homozygous for the Transgene 

To obtain transgenic chickens homozygous for the 
transgene, we crossbred G2 hemizygous birds bearing 
NLB-CMV-BL integrated at the same site (e.g., progeny 
of the same G1 male). Two groups were bred: the first 
was a hen and rooster arising from the G1 4133 male, 
and the second was from the G1 5657 hen. The Taqman 
assay was used to quantitatively detect the neoʳ transgene 
in G3 progeny using a standard curve, and the results 
for the crosses with 4133 are shown in Figure 7. The 
standard curve was constructed using known amounts 
of genomic DNA from the G1 transgenic 4133 male, which is 
hemizygous for the transgene as determined by Southern anal- 
ysis (Figure 5, Lane 3). The standards ranged from 
10³ to 1.6 × 10⁴ total copies of the transgene, or 0.2 to 3.1 
transgene copies per diploid genome (Figure 7A). Because 
reaction components were not limiting during the expo- 
nential phase, amplification was very efficient and gave 
reproducible values for a given copy number. There was a 
reproducible one-cycle difference between standards 
differing twofold in copy number (Figure 7A). 
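
The one-cycle spacing follows from exponential amplification: if template roughly doubles each cycle, a twofold difference in starting copies shifts the threshold cycle by about one cycle. The sketch below illustrates that relationship under an idealized model; the function name and the `efficiency` parameter are assumptions introduced here and are not part of the assay software.

```python
import math

def cycle_shift(fold_change, efficiency=1.0):
    """Cycles by which the threshold is reached earlier for a fold increase in template.

    efficiency=1.0 means perfect doubling every cycle (an idealization);
    lower values model less efficient amplification.
    """
    if fold_change <= 0 or not 0 < efficiency <= 1:
        raise ValueError("fold_change must be > 0 and efficiency in (0, 1]")
    return math.log(fold_change) / math.log(1 + efficiency)

print(cycle_shift(2))       # twofold more template, perfect doubling
print(cycle_shift(2, 0.9))  # same comparison at 90% per-cycle efficiency
```

With perfect doubling, the shift for a twofold difference is exactly one cycle, and any loss of efficiency makes the shift slightly larger than one.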

To determine the number of transgene alleles in G3 
offspring, DNA were amplified and compared to the stan- 
dards. DNA from nontransgenics did not amplify (Figure 
7B). Birds homozygous for the transgenic allele gave rise 
to plots initiating amplification one cycle earlier than 
FIGURE 7. Determination of transgene copy number. (A) A standard 
curve for transgene copy number was produced with genomic DNA 
from 4133, which harbors one copy of NLB-CMV-BL per diploid ge- 
nome. Standards (Std) were 10³ copies of the transgene (Std1), 2 × 10³ 
copies (Std2), 4 × 10³ copies (Std3), 8 × 10³ copies (Std4), and 1.6 × 10⁴ copies 
(Std5), corresponding to 0.2 copies (Std1), 0.4 copies (Std2), 0.8 copies 
(Std3), 1.6 copies (Std4), and 3.1 copies (Std5) of transgene per diploid ge- 
nome. Boxed points distinguish amplification curves from reactions 
performed in triplicate. The no template control (NTC) contained no 
DNA. (B) DNA from three potential Generation 3 (G3) offspring bred 
from a hen and rooster arising from the G1 4133 male were assayed in 
triplicate. (C) Southern blot analysis of DNA from G3 offspring. DNA 
were digested with PstI, equal amounts of DNA were electrophoresed, 
and the blot was probed with a fragment from the neoʳ coding sequence. 
The expected 0.9-kb fragment is marked. DNA from G3 birds bred from 
a hen and rooster arising from the G1 4133 male are shown in Lanes 
1 to 8. Bird 4133 is a G1 transgenic carrying one copy of the transgene. 
DNA from a nontransgenic offspring (NTO) is shown in Lane 10. Copy 
number as determined by Taqman analysis is indicated at the top of 
each lane. 
TABLE 2. Determination of transgene copy number in G3 offspring bred from G2 transgenics¹

G1 parent   Band no.        Cycle         Mean total    Standard    Copies per
            (Std or NTC)    threshold²    copy number   deviation   diploid genome³

NA⁴         4133            27.3          3,975         145.7       1
NA          4706            26.2          7,951         417.4       2
4133        [illegible]     40.0          0             0.0         0
5657        [illegible]     [illegible]   10,510        587.0       2
5657        [illegible]     [illegible]   10,401        505.1       2
4133        [illegible]     26.7          6,064         443.1       1
4133        [illegible]     26.8          5,239         133.8       1
4133        [illegible]     26.1          9,096         352.3       2
4133        [illegible]     26.8          5,424         55.7        1
4133        [illegible]     26.9          4,820         110.1       1
5657        [illegible]     26.4          8,092         1,037.5     2
5657        7111            30.4          403           46.3        0
5657        7112            33.2          60            6.1         0
4133        [illegible]     26.5          6,023         367.6       1
4133        7141            25.9          9,474         569.8       2
4133        7144            25.7          12,420        807.7       2
4133        [illegible]     27.2          4,246         201.7       1
[illegible] [illegible]     37.7          1             1.0         0
NA          (Std1)          29.1          1,000         0.0         0.2
NA          (Std2)          28.1          2,000         0.0         0.4
NA          (Std3)          27.1          4,000         0.0         0.8
NA          (Std4)          26.2          8,000         0.0         1.6
NA          (Std5)          25.3          16,000        0.0         3.1
NA          (NTC)           39.8          -1            0.0         0.0

¹G1 = Generation 1; G2 = Generation 2; G3 = Generation 3; Std = standard number; NTC = no template control. 
²Cycle at which a sample's fluorescence exhibited a significant increase above background. 
³Copies per diploid genome were determined by dividing the mean by 5,100 and rounding to the 
first decimal place. 
⁴NA = not applicable. 
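
The conversion described in footnote 3 can be checked with a few lines of arithmetic. This is a minimal sketch, assuming only the divisor of 5,100 stated in the footnote; the function and constant names are illustrative. The whole-number values listed for individual birds appear to reflect rounding to the nearest whole number, and that reading is an assumption.

```python
COPIES_PER_GENOME_DIVISOR = 5100  # divisor stated in footnote 3 of Table 2

def copies_per_diploid_genome(mean_total_copies, decimals=1):
    """Convert a mean total copy number to transgene copies per diploid genome."""
    return round(mean_total_copies / COPIES_PER_GENOME_DIVISOR, decimals)

# Standards: rounded to the first decimal place, as in the footnote
for total in (1000, 2000, 4000, 8000, 16000):
    print(total, copies_per_diploid_genome(total))

# Individual birds: rounded to a whole number (assumed reading of the table)
for total in (3975, 7951, 403):
    print(total, int(copies_per_diploid_genome(total, decimals=0)))
```

Applied to the standards, the division gives the 0.2 to 3.1 copies per diploid genome quoted for the standard curve.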



those hemizygous for the allele (Figure 7B). The sequence 
detection program was able to calculate the number of 
alleles in an unknown DNA sample based on the standard 
curve and the cycle threshold at which a sample's ampli- 
fication plot exhibited a significant rise. The data are 
shown in Table 2. The copy number of 4706, an NLB-OV- 
CAT G1, shown to have two copies of the transgene by 
Southern analysis (Figure 6, Lane 4), was confirmed by 
this analysis as well (Table 2). 
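
The calculation performed by the sequence detection program is not described in detail, but the general idea of reading a copy number off a standard curve can be sketched as follows. The example fits a straight line of cycle threshold against log2 of copy number using four of the standards from Table 2 (Std2 to Std5) and then inverts it for an unknown sample. The least-squares fit and all function names are assumptions made for illustration; this is not the instrument's algorithm, and it will not reproduce the Table 2 values exactly.

```python
import math

# (total transgene copies, cycle threshold) for Std2 to Std5 in Table 2
STANDARDS = [(2000, 28.1), (4000, 27.1), (8000, 26.2), (16000, 25.3)]

def fit_standard_curve(standards):
    """Least-squares fit of Ct = intercept + slope * log2(copies)."""
    xs = [math.log2(copies) for copies, _ in standards]
    ys = [ct for _, ct in standards]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def copies_from_ct(ct, slope, intercept):
    """Invert the fitted line to estimate total copies from a cycle threshold."""
    return 2 ** ((ct - intercept) / slope)

slope, intercept = fit_standard_curve(STANDARDS)
print(f"slope = {slope:.2f} cycles per doubling")
for ct in (27.3, 26.2):
    print(ct, round(copies_from_ct(ct, slope, intercept)))
```

A slope close to -1 cycle per doubling is what the one-cycle spacing of the standards implies. The estimated total can then be divided by 5,100 to express it as copies per diploid genome.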

To confirm the Taqman copy number analysis, DNA of 
selected birds were digested with PstI and analyzed by Southern 
blotting with a probe complementary to the neoʳ 
gene to detect a 0.9-kb fragment (Figure 7C). A small 
fragment was chosen for detection because transfer of smaller DNA 
from gel to membrane is more quantitative. The signal 
intensity of the 0.9-kb band corresponded well to the 
copy number of G3 transgenic birds, as determined by 
an additional Taqman assay. The copy numbers of an addi- 
tional 18 G3 transgenic birds analyzed by Southern blot- 
ting were also consistent with those determined by Taqman 
(data not shown). Thirty-three progeny were analyzed 
for the 4133 lineage, of which nine (27.3%) were non- 
transgenic, 16 (48.5%) were hemizygous, and eight 
(24.2%) were homozygous. Ten progeny were analyzed 
for the 5657 lineage, of which five (50.0%) were non- 
transgenic, one (10.0%) was hemizygous, and four (40.0%) 
were homozygous. The observed ratio of nontransgenics, 
hemizygotes, and homozygotes for the 4133 lineage G3 
progeny was not statistically different from the expected 
1:2:1 ratio as determined by the χ² test (P > 0.05). Progeny 



of the 5657 lineage did not have the expected distribution, 
which could have been due to the few progeny tested. 
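
The comparison with a 1:2:1 ratio can be reproduced from the progeny counts reported above. The sketch below computes a chi-square goodness-of-fit statistic with 2 degrees of freedom, for which the P value has the closed form exp(-x/2). The function name is illustrative, and no claim is made that this matches the software the authors used.

```python
import math

def chi_square_1_2_1(nontransgenic, hemizygous, homozygous):
    """Chi-square goodness-of-fit test against a 1:2:1 ratio (2 degrees of freedom)."""
    observed = (nontransgenic, hemizygous, homozygous)
    total = sum(observed)
    expected = (total / 4, total / 2, total / 4)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 2 degrees of freedom the chi-square survival function is exp(-x / 2)
    p_value = math.exp(-stat / 2)
    return stat, p_value

stat_4133, p_4133 = chi_square_1_2_1(9, 16, 8)   # 4133 lineage, 33 progeny
stat_5657, p_5657 = chi_square_1_2_1(5, 1, 4)    # 5657 lineage, 10 progeny
print(f"4133: chi2 = {stat_4133:.3f}, P = {p_4133:.3f}")
print(f"5657: chi2 = {stat_5657:.3f}, P = {p_5657:.3f}")
```

The statistic is about 0.09 for the 4133 lineage and 6.6 for the 5657 lineage. With only 10 progeny in the 5657 lineage, the expected counts are as low as 2.5, so the chi-square approximation is rough there.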

DISCUSSION 

At the onset of this study, we sought to investigate the 
utility of the NLB vector system for several reasons. Of all 
of the available methods to produce transgenic chickens, 
retroviral vectors have the most consistent record of 
germline transmission. Of these, the NLB system was 
well characterized for its lack of production of replication- 
competent retroviruses (Cosset et al., 1990, 1991, 1992), 
which was an important feature because of the implica- 
tions for the health and safety of our transgenic chickens 
and their products. The NLB vector carries a minimal 
sequence payload between the LTRs and thus can accept 
transgenes of at least 3 kb (the length of the lacZ region 
in NLB) and possibly up to 8 kb. As an additional safe- 
guard for investigators, ALV-based vectors infect only 
avian species. 

By implementation of several recently developed and 
novel procedures, we confirmed the utility of the NLB 
system for chicken transgenesis. The first involved a win- 
dowing method (Speksnijder and Ivarie, 2000) so that we 
were able to hatch many potential founder chickens. Second, we 
incorporated the Taqman assay into the analysis of sperm 
DNA of potential founder males because it was rapid, 
gave quantitative estimates of transgene copy 
number, and allowed the rapid identification of males 
for breeding. 

In order to facilitate identification of transgenic off- 
spring, we developed a DNA extraction method that, in 
combination with the Taqman assay, could screen hun- 
dreds or thousands of chicks rapidly. Although high- 
throughput methods for sequence detection were avail- 
able, existing methods for extraction of DNA from avians 
have been labor-intensive and time-consuming or not 
compatible with fluorescence-based sequence detection 
systems (Salter et al., 1986; Grimberg et al., 1989; Petitte 
et al., 1994; Thoraval et al., 1995; Bercovich et al., 1999). 
We discovered that a high-throughput method used to 
extract DNA from mouse embryonic stem cells (Ramirez- 
Solis et al., 1992; Udy and Evans, 1994) could be modified 
to extract DNA directly from chick blood. Because of 
the method's reproducibility, DNA concentration did not 
have to be determined prior to analysis, and the extracted DNA 
could be used in standard PCR and Southern blot assays. The Taq- 
man assay and the high-throughput DNA extraction 
method also facilitated segregation of progeny that were 
nontransgenic, hemizygous, or homozygous. A similar 
Taqman assay was recently used to detect twofold differ- 
ences in transgene copy number in plants (Ingham et 
al., 2001). 

Germline transmission was achieved with three differ- 
ent transgenes. All three were integrated into single sites 
of the avian genome, except for one G1 rooster harboring 
the transgene in two sites. Once integrated, the transgenes 
appeared to be stable as we tracked them through three 
generations for two of the transgenes and four genera- 
tions for NLB-CMV-BL. All of the transgenic flocks have 
been healthy and have shown no signs of viremia or 
detectable levels of ALV particles in their blood. The neo- 
mycin resistance and lacZ genes were functional in NLB- 
bearing fibroblasts derived from transgenic G2 embryos 
bred from 3092 (data not shown), confirming the observa- 
tions of Thoraval et al. (1995). We will report separately 
on the expression of the β-lactamase transgene in blood 
and egg white as well as expression of the CAT gene. 

Our data revealed a low rate of germline transmission, 
which can be expected for most transgenes inserted into 
the chicken germline via the NLB system. That is, fewer than 
5% of G0 males will give rise to transgenic G1 progeny at 
an efficiency of 1% or less. This efficiency is lower than 
the 2.2% transmission rate reported by Thoraval et al. 
(1995) and is lower than that of a replication-competent ALV 
system in which 25% of G0 males gave rise to transgenic 
progeny at frequencies up to 11% (Salter et al., 1986, 1987). 
Bosselman et al. (1989) reported that 5% of G0 males 
transduced with REV replication-deficient particles gave 
rise to transgenic progeny at frequencies of 2 to 8%. How- 
ever, the packaging cell line used to package this vector 
was found to produce replication-competent virus (Hu 
et al., 1987). For replication-competent ALV, the vector 
probably replicated and transduced additional cells after 
the initial embryo injection, thereby increasing the pro- 
portion of transduced somatic and germline cells, which 
may also have been the case in embryos transduced with 
the REV-derived particles. The low rate of germline trans- 
mission from founders to fully transgenic offspring using 



existing methods was partially overcome by the im- 
proved screening methods described here. 
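
The practical consequence of a low transmission rate is the number of offspring that must be screened per founder. As a rough illustration, the sketch below computes how many offspring are needed to have a given chance of finding at least one transgenic bird, assuming each offspring is an independent trial with the same transmission probability. The function name and the 95% confidence level are hypothetical choices, not values from the study.

```python
import math

def offspring_needed(transmission_rate, confidence=0.95):
    """Smallest n such that P(at least one transgenic among n offspring) >= confidence.

    Assumes independent offspring with a constant transmission probability.
    """
    if not 0 < transmission_rate < 1 or not 0 < confidence < 1:
        raise ValueError("transmission_rate and confidence must be between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - transmission_rate))

for rate in (0.01, 0.022, 0.11):
    print(f"{rate:.1%} transmission: screen {offspring_needed(rate)} offspring")
```

At a 1% transmission rate, roughly 300 offspring per founder are needed for a 95% chance of recovering one transgenic bird, which is why a screen that handles hundreds or thousands of chicks matters.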

Blood DNA from G0 chickens contained transgenes 
in less than 1% of genome equivalents (data not shown), 
indicating that the efficiency of transduction in somatic 
cells is similar to that for germ cells. Our data are consistent 
with an earlier study that showed that 1% of stage X blasto- 
dermal cells contained an integrated provirus after infection 
at a multiplicity of infection (MOI) of 2 to 12 (Thomas et al., 1992). 
The low transduction efficiency is an enigma, as a stage 
X (Eyal-Giladi and Kochav, 1976) embryo typically 
consists of 30,000 to 50,000 cells, and 20,000 to 70,000 
transducing particles were injected per embryo into the 
subgerminal cavity, which should allow access to most 
of the embryonic cells including primordial germ cells 
(Kochav et al., 1980). Possibly only a fraction of the viral 
particles actually contact the blastoderm, suggesting that 
higher titer preparations could increase the transduc- 
tion rate. 
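
The size of the discrepancy can be made concrete with the particle and cell numbers given above. The sketch below computes the range of particles per cell and, as an idealized upper bound, the fraction of cells expected to receive at least one particle if particles were distributed among cells at random (a Poisson assumption introduced here, not a model from the study). Function names are illustrative.

```python
import math

def particles_per_cell(particles, cells):
    """Ratio of injected transducing particles to embryonic cells."""
    return particles / cells

def fraction_with_particle(ratio):
    """Poisson estimate of the fraction of cells receiving at least one particle."""
    return 1 - math.exp(-ratio)

# Extremes of the ranges quoted in the text
low = particles_per_cell(20_000, 50_000)   # fewest particles, most cells
high = particles_per_cell(70_000, 30_000)  # most particles, fewest cells

for ratio in (low, high):
    print(f"{ratio:.2f} particles per cell -> "
          f"{fraction_with_particle(ratio):.0%} of cells reached (idealized)")
```

Even at the low end, this idealized calculation predicts that about a third of cells would encounter a particle, far above the roughly 1% observed, which is consistent with the suggestion that only a fraction of the particles contact the blastoderm.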